Compare commits

No commits in common. "c8s-stream-4.0" and "c8-stream-1.0" have entirely different histories.

10 changed files with 732 additions and 254 deletions

2
.gitignore vendored

@@ -1 +1 @@
-SOURCES/v1.1.4.tar.gz
+SOURCES/runc-2abd837.tar.gz


@@ -1 +1 @@
-fb65327930c41c8ec016badd6738bef83b556aed SOURCES/v1.1.4.tar.gz
+cf7119a838db2963e7af6ecdba90a2cc95ec0d56 SOURCES/runc-2abd837.tar.gz


@@ -0,0 +1,62 @@
From dfb3496c174377b860b62872ce6af951364cc3ac Mon Sep 17 00:00:00 2001
From: Lokesh Mandvekar <lsm5@fedoraproject.org>
Date: Tue, 12 Dec 2017 13:22:42 +0530
Subject: [PATCH] Revert "Apply cgroups earlier"
This reverts commit 7062c7556b71188abc18d7516441ff4b03fbc1fc.
---
libcontainer/process_linux.go | 31 ++++++++++++++-----------------
1 file changed, 14 insertions(+), 17 deletions(-)
diff --git a/libcontainer/process_linux.go b/libcontainer/process_linux.go
index 149b1126..b8a395af 100644
--- a/libcontainer/process_linux.go
+++ b/libcontainer/process_linux.go
@@ -272,6 +272,20 @@ func (p *initProcess) start() error {
p.process.ops = nil
return newSystemErrorWithCause(err, "starting init process command")
}
+ if _, err := io.Copy(p.parentPipe, p.bootstrapData); err != nil {
+ return newSystemErrorWithCause(err, "copying bootstrap data to pipe")
+ }
+ if err := p.execSetns(); err != nil {
+ return newSystemErrorWithCause(err, "running exec setns process for init")
+ }
+ // Save the standard descriptor names before the container process
+ // can potentially move them (e.g., via dup2()). If we don't do this now,
+ // we won't know at checkpoint time which file descriptor to look up.
+ fds, err := getPipeFds(p.pid())
+ if err != nil {
+ return newSystemErrorWithCausef(err, "getting pipe fds for pid %d", p.pid())
+ }
+ p.setExternalDescriptors(fds)
// Do this before syncing with child so that no children can escape the
// cgroup. We don't need to worry about not doing this and not being root
// because we'd be using the rootless cgroup manager in that case.
@@ -292,23 +306,6 @@ func (p *initProcess) start() error {
}
}
}()
-
- if _, err := io.Copy(p.parentPipe, p.bootstrapData); err != nil {
- return newSystemErrorWithCause(err, "copying bootstrap data to pipe")
- }
-
- if err := p.execSetns(); err != nil {
- return newSystemErrorWithCause(err, "running exec setns process for init")
- }
-
- // Save the standard descriptor names before the container process
- // can potentially move them (e.g., via dup2()). If we don't do this now,
- // we won't know at checkpoint time which file descriptor to look up.
- fds, err := getPipeFds(p.pid())
- if err != nil {
- return newSystemErrorWithCausef(err, "getting pipe fds for pid %d", p.pid())
- }
- p.setExternalDescriptors(fds)
if err := p.createNetworkInterfaces(); err != nil {
return newSystemErrorWithCause(err, "creating network interfaces")
}
--
2.14.3


@@ -0,0 +1,290 @@
From bf6405284aa3870a39b402309003633a1c230ed9 Mon Sep 17 00:00:00 2001
From: Aleksa Sarai <asarai@suse.de>
Date: Wed, 9 Jan 2019 13:40:01 +1100
Subject: [PATCH 1/1] nsenter: clone /proc/self/exe to avoid exposing host
binary to container
There are quite a few circumstances where /proc/self/exe pointing to a
pretty important container binary is a _bad_ thing, so to avoid this we
have to make a copy (preferably doing self-clean-up and not being
writeable).
As a hotfix we require memfd_create(2), but we can always extend this to
use a scratch MNT_DETACH overlayfs or tmpfs. The main downside to this
approach is no page-cache sharing for the runc binary (which overlayfs
would give us) but this is far less complicated.
This is only done during nsenter so that it happens transparently to the
Go code, and any libcontainer users benefit from it. This also makes
ExtraFiles and --preserve-fds handling trivial (because we don't need to
worry about it).
Fixes: CVE-2019-5736
Co-developed-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
---
libcontainer/nsenter/cloned_binary.c | 221 +++++++++++++++++++++++++++
libcontainer/nsenter/nsexec.c | 11 ++
2 files changed, 232 insertions(+)
create mode 100644 libcontainer/nsenter/cloned_binary.c
diff --git a/libcontainer/nsenter/cloned_binary.c b/libcontainer/nsenter/cloned_binary.c
new file mode 100644
index 00000000..d9f6093a
--- /dev/null
+++ b/libcontainer/nsenter/cloned_binary.c
@@ -0,0 +1,221 @@
+#define _GNU_SOURCE
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <errno.h>
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/vfs.h>
+#include <sys/mman.h>
+#include <sys/sendfile.h>
+#include <sys/syscall.h>
+
+#include <linux/magic.h>
+#include <linux/memfd.h>
+
+/* Use our own wrapper for memfd_create. */
+#if !defined(SYS_memfd_create) && defined(__NR_memfd_create)
+# define SYS_memfd_create __NR_memfd_create
+#endif
+#ifndef SYS_memfd_create
+# error "memfd_create(2) syscall not supported by this glibc version"
+#endif
+int memfd_create(const char *name, unsigned int flags)
+{
+ return syscall(SYS_memfd_create, name, flags);
+}
+
+/* This comes directly from <linux/fcntl.h>. */
+#ifndef F_LINUX_SPECIFIC_BASE
+# define F_LINUX_SPECIFIC_BASE 1024
+#endif
+#ifndef F_ADD_SEALS
+# define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
+# define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
+#endif
+#ifndef F_SEAL_SEAL
+# define F_SEAL_SEAL 0x0001 /* prevent further seals from being set */
+# define F_SEAL_SHRINK 0x0002 /* prevent file from shrinking */
+# define F_SEAL_GROW 0x0004 /* prevent file from growing */
+# define F_SEAL_WRITE 0x0008 /* prevent writes */
+#endif
+
+
+#define OUR_MEMFD_COMMENT "runc_cloned:/proc/self/exe"
+#define OUR_MEMFD_SEALS \
+ (F_SEAL_SEAL | F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE)
+
+static void *must_realloc(void *ptr, size_t size)
+{
+ void *old = ptr;
+ do {
+ ptr = realloc(old, size);
+ } while(!ptr);
+ return ptr;
+}
+
+/*
+ * Verify whether we are currently in a self-cloned program (namely, is
+ * /proc/self/exe a memfd). F_GET_SEALS will only succeed for memfds (or rather
+ * for shmem files), and we want to be sure it's actually sealed.
+ */
+static int is_self_cloned(void)
+{
+ int fd, seals;
+
+ fd = open("/proc/self/exe", O_RDONLY|O_CLOEXEC);
+ if (fd < 0)
+ return -ENOTRECOVERABLE;
+
+ seals = fcntl(fd, F_GET_SEALS);
+ close(fd);
+ return seals == OUR_MEMFD_SEALS;
+}
+
+/*
+ * Basic wrapper around mmap(2) that gives you the file length so you can
+ * safely treat it as an ordinary buffer. Only gives you read access.
+ */
+static char *read_file(char *path, size_t *length)
+{
+ int fd;
+ char buf[4096], *copy = NULL;
+
+ if (!length)
+ return NULL;
+
+ fd = open(path, O_RDONLY | O_CLOEXEC);
+ if (fd < 0)
+ return NULL;
+
+ *length = 0;
+ for (;;) {
+ int n;
+
+ n = read(fd, buf, sizeof(buf));
+ if (n < 0)
+ goto error;
+ if (!n)
+ break;
+
+ copy = must_realloc(copy, (*length + n) * sizeof(*copy));
+ memcpy(copy + *length, buf, n);
+ *length += n;
+ }
+ close(fd);
+ return copy;
+
+error:
+ close(fd);
+ free(copy);
+ return NULL;
+}
+
+/*
+ * A poor-man's version of "xargs -0". Basically parses a given block of
+ * NUL-delimited data, within the given length and adds a pointer to each entry
+ * to the array of pointers.
+ */
+static int parse_xargs(char *data, int data_length, char ***output)
+{
+ int num = 0;
+ char *cur = data;
+
+ if (!data || *output != NULL)
+ return -1;
+
+ while (cur < data + data_length) {
+ num++;
+ *output = must_realloc(*output, (num + 1) * sizeof(**output));
+ (*output)[num - 1] = cur;
+ cur += strlen(cur) + 1;
+ }
+ (*output)[num] = NULL;
+ return num;
+}
+
+/*
+ * "Parse" out argv and envp from /proc/self/cmdline and /proc/self/environ.
+ * This is necessary because we are running in a context where we don't have a
+ * main() that we can just get the arguments from.
+ */
+static int fetchve(char ***argv, char ***envp)
+{
+ char *cmdline = NULL, *environ = NULL;
+ size_t cmdline_size, environ_size;
+
+ cmdline = read_file("/proc/self/cmdline", &cmdline_size);
+ if (!cmdline)
+ goto error;
+ environ = read_file("/proc/self/environ", &environ_size);
+ if (!environ)
+ goto error;
+
+ if (parse_xargs(cmdline, cmdline_size, argv) <= 0)
+ goto error;
+ if (parse_xargs(environ, environ_size, envp) <= 0)
+ goto error;
+
+ return 0;
+
+error:
+ free(environ);
+ free(cmdline);
+ return -EINVAL;
+}
+
+#define SENDFILE_MAX 0x7FFFF000 /* sendfile(2) is limited to 2GB. */
+static int clone_binary(void)
+{
+ int binfd, memfd, err;
+ ssize_t sent = 0;
+
+ memfd = memfd_create(OUR_MEMFD_COMMENT, MFD_CLOEXEC | MFD_ALLOW_SEALING);
+ if (memfd < 0)
+ return -ENOTRECOVERABLE;
+
+ binfd = open("/proc/self/exe", O_RDONLY | O_CLOEXEC);
+ if (binfd < 0)
+ goto error;
+
+ sent = sendfile(memfd, binfd, NULL, SENDFILE_MAX);
+ close(binfd);
+ if (sent < 0)
+ goto error;
+
+ err = fcntl(memfd, F_ADD_SEALS, OUR_MEMFD_SEALS);
+ if (err < 0)
+ goto error;
+
+ return memfd;
+
+error:
+ close(memfd);
+ return -EIO;
+}
+
+int ensure_cloned_binary(void)
+{
+ int execfd;
+ char **argv = NULL, **envp = NULL;
+
+ /* Check that we're not self-cloned, and if we are then bail. */
+ int cloned = is_self_cloned();
+ if (cloned > 0 || cloned == -ENOTRECOVERABLE)
+ return cloned;
+
+ if (fetchve(&argv, &envp) < 0)
+ return -EINVAL;
+
+ execfd = clone_binary();
+ if (execfd < 0)
+ return -EIO;
+
+ fexecve(execfd, argv, envp);
+ return -ENOEXEC;
+}
diff --git a/libcontainer/nsenter/nsexec.c b/libcontainer/nsenter/nsexec.c
index cb224314..784fd9b0 100644
--- a/libcontainer/nsenter/nsexec.c
+++ b/libcontainer/nsenter/nsexec.c
@@ -528,6 +528,9 @@ void join_namespaces(char *nslist)
free(namespaces);
}
+/* Defined in cloned_binary.c. */
+int ensure_cloned_binary(void);
+
void nsexec(void)
{
int pipenum;
@@ -543,6 +546,14 @@ void nsexec(void)
if (pipenum == -1)
return;
+ /*
+ * We need to re-exec if we are not in a cloned binary. This is necessary
+ * to ensure that containers won't be able to access the host binary
+ * through /proc/self/exe. See CVE-2019-5736.
+ */
+ if (ensure_cloned_binary() < 0)
+ bail("could not ensure we are a cloned binary");
+
/* Parse all of the netlink configuration. */
nl_parse(pipenum, &config);
--
2.20.1

200
SOURCES/1807.patch Normal file

@@ -0,0 +1,200 @@
From ecf53c23545092019602578583031c28fde4d2a1 Mon Sep 17 00:00:00 2001
From: Giuseppe Scrivano <gscrivan@redhat.com>
Date: Fri, 25 May 2018 18:04:06 +0200
Subject: [PATCH] sd-notify: do not hang when NOTIFY_SOCKET is used with create
if NOTIFY_SOCKET is used, do not block the main runc process waiting
for events on the notify socket. Change the logic to create a new
process that monitors exclusively the notify socket until an event is
received.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
---
init.go | 12 +++++++
notify_socket.go | 101 ++++++++++++++++++++++++++++++++++++++++++++++---------
signals.go | 5 +--
3 files changed, 99 insertions(+), 19 deletions(-)
diff --git a/init.go b/init.go
index c8f453192..6a3d9e91c 100644
--- a/init.go
+++ b/init.go
@@ -20,6 +20,18 @@ var initCommand = cli.Command{
Name: "init",
Usage: `initialize the namespaces and launch the process (do not call it outside of runc)`,
Action: func(context *cli.Context) error {
+ // If NOTIFY_SOCKET is used, create a new process that stays around
+ // so as not to block "runc start". It automatically exits when the
+ // container notifies that it is ready, or when the container is deleted.
+ if os.Getenv("_NOTIFY_SOCKET_FD") != "" {
+ fd := os.Getenv("_NOTIFY_SOCKET_FD")
+ pid := os.Getenv("_NOTIFY_SOCKET_PID")
+ hostNotifySocket := os.Getenv("_NOTIFY_SOCKET_HOST")
+ notifySocketPath := os.Getenv("_NOTIFY_SOCKET_PATH")
+ notifySocketInit(fd, pid, hostNotifySocket, notifySocketPath)
+ os.Exit(0)
+ }
+
factory, _ := libcontainer.New("")
if err := factory.StartInitialization(); err != nil {
// as the error is sent back to the parent there is no need to log
diff --git a/notify_socket.go b/notify_socket.go
index cd6c0a989..e04e9d660 100644
--- a/notify_socket.go
+++ b/notify_socket.go
@@ -6,10 +6,13 @@ import (
"bytes"
"fmt"
"net"
+ "os"
+ "os/exec"
"path/filepath"
+ "strconv"
+ "time"
"github.com/opencontainers/runtime-spec/specs-go"
-
"github.com/sirupsen/logrus"
"github.com/urfave/cli"
)
@@ -64,24 +67,94 @@ func (s *notifySocket) setupSocket() error {
return nil
}
+func (notifySocket *notifySocket) notifyNewPid(pid int) {
+ notifySocketHostAddr := net.UnixAddr{Name: notifySocket.host, Net: "unixgram"}
+ client, err := net.DialUnix("unixgram", nil, &notifySocketHostAddr)
+ if err != nil {
+ return
+ }
+ newPid := fmt.Sprintf("MAINPID=%d\n", pid)
+ client.Write([]byte(newPid))
+}
+
// pid1 must be set only with -d, as it is used to set the new process as the main process
// for the service in systemd
func (notifySocket *notifySocket) run(pid1 int) {
- buf := make([]byte, 512)
- notifySocketHostAddr := net.UnixAddr{Name: notifySocket.host, Net: "unixgram"}
- client, err := net.DialUnix("unixgram", nil, &notifySocketHostAddr)
+ file, err := notifySocket.socket.File()
if err != nil {
logrus.Error(err)
return
}
- for {
- r, err := notifySocket.socket.Read(buf)
- if err != nil {
- break
+ defer file.Close()
+ defer notifySocket.socket.Close()
+
+ cmd := exec.Command("/proc/self/exe", "init")
+ cmd.ExtraFiles = []*os.File{file}
+ cmd.Env = append(cmd.Env, "_NOTIFY_SOCKET_FD=3",
+ fmt.Sprintf("_NOTIFY_SOCKET_PID=%d", pid1),
+ fmt.Sprintf("_NOTIFY_SOCKET_HOST=%s", notifySocket.host),
+ fmt.Sprintf("_NOTIFY_SOCKET_PATH=%s", notifySocket.socketPath))
+
+ if err := cmd.Start(); err != nil {
+ logrus.Fatal(err)
+ }
+ notifySocket.notifyNewPid(cmd.Process.Pid)
+ cmd.Process.Release()
+}
+
+func notifySocketInit(envFd string, envPid string, notifySocketHost string, notifySocketPath string) {
+ intFd, err := strconv.Atoi(envFd)
+ if err != nil {
+ return
+ }
+ pid1, err := strconv.Atoi(envPid)
+ if err != nil {
+ return
+ }
+
+ file := os.NewFile(uintptr(intFd), "unixgram")
+ defer file.Close()
+
+ fileChan := make(chan []byte)
+ exitChan := make(chan bool)
+
+ go func() {
+ for {
+ buf := make([]byte, 512)
+ r, err := file.Read(buf)
+ if err != nil {
+ return
+ }
+ fileChan <- buf[0:r]
}
- var out bytes.Buffer
- for _, line := range bytes.Split(buf[0:r], []byte{'\n'}) {
- if bytes.HasPrefix(line, []byte("READY=")) {
+ }()
+ go func() {
+ for {
+ if _, err := os.Stat(notifySocketPath); os.IsNotExist(err) {
+ exitChan <- true
+ return
+ }
+ time.Sleep(time.Second)
+ }
+ }()
+
+ notifySocketHostAddr := net.UnixAddr{Name: notifySocketHost, Net: "unixgram"}
+ client, err := net.DialUnix("unixgram", nil, &notifySocketHostAddr)
+ if err != nil {
+ return
+ }
+
+ for {
+ select {
+ case <-exitChan:
+ return
+ case b := <-fileChan:
+ for _, line := range bytes.Split(b, []byte{'\n'}) {
+ if !bytes.HasPrefix(line, []byte("READY=")) {
+ continue
+ }
+
+ var out bytes.Buffer
_, err = out.Write(line)
if err != nil {
return
@@ -98,10 +171,8 @@ func (notifySocket *notifySocket) run(pid1 int) {
}
// now we can inform systemd to use pid1 as the pid to monitor
- if pid1 > 0 {
- newPid := fmt.Sprintf("MAINPID=%d\n", pid1)
- client.Write([]byte(newPid))
- }
+ newPid := fmt.Sprintf("MAINPID=%d\n", pid1)
+ client.Write([]byte(newPid))
return
}
}
diff --git a/signals.go b/signals.go
index 1811de837..d0988cb39 100644
--- a/signals.go
+++ b/signals.go
@@ -70,7 +70,7 @@ func (h *signalHandler) forward(process *libcontainer.Process, tty *tty, detach
h.notifySocket.run(pid1)
return 0, nil
} else {
- go h.notifySocket.run(0)
+ h.notifySocket.run(os.Getpid())
}
}
@@ -98,9 +98,6 @@ func (h *signalHandler) forward(process *libcontainer.Process, tty *tty, detach
// status because we must ensure that any of the go specific process
// fun such as flushing pipes are complete before we return.
process.Wait()
- if h.notifySocket != nil {
- h.notifySocket.Close()
- }
return e.status, nil
}
}


@@ -1,84 +0,0 @@
From 2ce40b6ad72b4bd4391380cafc5ef1bad1fa0b31 Mon Sep 17 00:00:00 2001
From: Kir Kolyshkin <kolyshkin@gmail.com>
Date: Wed, 4 May 2022 14:56:16 -0700
Subject: [PATCH] Remove tun/tap from the default device rules
Looking through git blame, this was added by commit 9fac18329
aka "Initial commit of runc binary", most probably by mistake.
Obviously, a container should not have access to tun/tap device, unless
it is explicitly specified in configuration.
Now, removing this might create a compatibility issue, but I see no
other choice.
Aside from the obvious misconfiguration, this should also fix the
annoying
> Apr 26 03:46:56 foo.bar systemd[1]: Couldn't stat device /dev/char/10:200: No such file or directory
messages from systemd on every container start, when runc uses systemd
cgroup driver, and the system runs an old (< v240) version of systemd
(the message was presumably eliminated by [1]).
[1] https://github.com/systemd/systemd/pull/10996/commits/d5aecba6e0b7c73657c4cf544ce57289115098e7
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
---
.../ebpf/devicefilter/devicefilter_test.go | 19 ++++++-------------
libcontainer/specconv/spec_linux.go | 10 ----------
2 files changed, 6 insertions(+), 23 deletions(-)
diff --git a/libcontainer/cgroups/ebpf/devicefilter/devicefilter_test.go b/libcontainer/cgroups/ebpf/devicefilter/devicefilter_test.go
index d279335821..25703be5ad 100644
--- a/libcontainer/cgroups/ebpf/devicefilter/devicefilter_test.go
+++ b/libcontainer/cgroups/ebpf/devicefilter/devicefilter_test.go
@@ -120,21 +120,14 @@ block-8:
51: Mov32Imm dst: r0 imm: 1
52: Exit
block-9:
-// tuntap (c, 10, 200, rwm, allow)
+// /dev/pts (c, 136, wildcard, rwm, true)
53: JNEImm dst: r2 off: -1 imm: 2 <block-10>
- 54: JNEImm dst: r4 off: -1 imm: 10 <block-10>
- 55: JNEImm dst: r5 off: -1 imm: 200 <block-10>
- 56: Mov32Imm dst: r0 imm: 1
- 57: Exit
+ 54: JNEImm dst: r4 off: -1 imm: 136 <block-10>
+ 55: Mov32Imm dst: r0 imm: 1
+ 56: Exit
block-10:
-// /dev/pts (c, 136, wildcard, rwm, true)
- 58: JNEImm dst: r2 off: -1 imm: 2 <block-11>
- 59: JNEImm dst: r4 off: -1 imm: 136 <block-11>
- 60: Mov32Imm dst: r0 imm: 1
- 61: Exit
-block-11:
- 62: Mov32Imm dst: r0 imm: 0
- 63: Exit
+ 57: Mov32Imm dst: r0 imm: 0
+ 58: Exit
`
var devices []*devices.Rule
for _, device := range specconv.AllowedDevices {
diff --git a/libcontainer/specconv/spec_linux.go b/libcontainer/specconv/spec_linux.go
index 5ae95c6c18..83c7a2c348 100644
--- a/libcontainer/specconv/spec_linux.go
+++ b/libcontainer/specconv/spec_linux.go
@@ -302,16 +302,6 @@ var AllowedDevices = []*devices.Device{
Allow: true,
},
},
- // tuntap
- {
- Rule: devices.Rule{
- Type: devices.CharDevice,
- Major: 10,
- Minor: 200,
- Permissions: "rwm",
- Allow: true,
- },
- },
}
type CreateOpts struct {


@@ -0,0 +1 @@
fs.may_detach_mounts=1


@@ -0,0 +1,61 @@
diff --git a/list.go b/list.go
index 0313d8c..328798b 100644
--- a/list.go
+++ b/list.go
@@ -50,7 +50,7 @@ var listCommand = cli.Command{
ArgsUsage: `
Where the given root is specified via the global option "--root"
-(default: "/run/runc").
+(default: "/run/runc-ctrs").
EXAMPLE 1:
To list containers created via the default "--root":
diff --git a/main.go b/main.go
index 278399a..0f49fce 100644
--- a/main.go
+++ b/main.go
@@ -62,7 +62,7 @@ func main() {
v = append(v, fmt.Sprintf("spec: %s", specs.Version))
app.Version = strings.Join(v, "\n")
- root := "/run/runc"
+ root := "/run/runc-ctrs"
rootless, err := isRootless(nil)
if err != nil {
fatal(err)
@@ -70,7 +70,7 @@ func main() {
if rootless {
runtimeDir := os.Getenv("XDG_RUNTIME_DIR")
if runtimeDir != "" {
- root = runtimeDir + "/runc"
+ root = runtimeDir + "/runc-ctrs"
// According to the XDG specification, we need to set anything in
// XDG_RUNTIME_DIR to have a sticky bit if we don't want it to get
// auto-pruned.
diff --git a/man/runc-list.8.md b/man/runc-list.8.md
index f737424..107220e 100644
--- a/man/runc-list.8.md
+++ b/man/runc-list.8.md
@@ -6,7 +6,7 @@
# EXAMPLE
Where the given root is specified via the global option "--root"
-(default: "/run/runc").
+(default: "/run/runc-ctrs").
To list containers created via the default "--root":
# runc list
diff --git a/man/runc.8.md b/man/runc.8.md
index 6d0ddff..337bc73 100644
--- a/man/runc.8.md
+++ b/man/runc.8.md
@@ -51,7 +51,7 @@ value for "bundle" is the current directory.
--debug enable debug output for logging
--log value set the log file path where internal debug information is written (default: "/dev/null")
--log-format value set the format used by logs ('text' (default), or 'json') (default: "text")
- --root value root directory for storage of container state (this should be located in tmpfs) (default: "/run/runc" or $XDG_RUNTIME_DIR/runc for rootless containers)
+ --root value root directory for storage of container state (this should be located in tmpfs) (default: "/run/runc-ctrs" or $XDG_RUNTIME_DIR/runc-ctrs for rootless containers)
--criu value path to the criu binary used for checkpoint and restore (default: "criu")
--systemd-cgroup enable systemd cgroup support, expects cgroupsPath to be of form "slice:prefix:name" for e.g. "system.slice:runc:434234"
--rootless value enable rootless mode ('true', 'false', or 'auto') (default: "auto")
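
The main.go hunks above change how the default state directory is chosen: "/run/runc-ctrs" normally, or "$XDG_RUNTIME_DIR/runc-ctrs" for rootless runs when that variable is set. The selection logic can be sketched as a small pure function. `defaultRoot` is a hypothetical name; in the patched main.go the logic is inline in `main()`.

```go
package main

import "fmt"

// defaultRoot is a hypothetical helper (not part of runc). It mirrors the
// selection logic in the patched main.go: a fixed system path, replaced by
// a path under XDG_RUNTIME_DIR for rootless runs when that directory is set.
func defaultRoot(rootless bool, runtimeDir string) string {
	root := "/run/runc-ctrs"
	if rootless && runtimeDir != "" {
		root = runtimeDir + "/runc-ctrs"
	}
	return root
}

func main() {
	// Assumed example values, for illustration only.
	fmt.Println(defaultRoot(false, ""))
	fmt.Println(defaultRoot(true, "/run/user/1000"))
}
```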

72
SOURCES/pivot-root.patch Normal file

@@ -0,0 +1,72 @@
From 28a697cce3e4f905dca700eda81d681a30eef9cd Mon Sep 17 00:00:00 2001
From: Giuseppe Scrivano <gscrivan@redhat.com>
Date: Fri, 11 Jan 2019 21:53:45 +0100
Subject: [PATCH] rootfs: umount all procfs and sysfs with --no-pivot
When creating a new user namespace, the kernel doesn't allow to mount
a new procfs or sysfs file system if there is not already one instance
fully visible in the current mount namespace.
When using --no-pivot we were effectively inhibiting this protection
from the kernel, as /proc and /sys from the host are still present in
the container mount namespace.
A container without full access to /proc could then create a new user
namespace, and from there able to mount a fully visible /proc, bypassing
the limitations in the container.
A simple reproducer for this issue is:
unshare -mrfp sh -c "mount -t proc none /proc && echo c > /proc/sysrq-trigger"
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
---
libcontainer/rootfs_linux.go | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/libcontainer/rootfs_linux.go b/libcontainer/rootfs_linux.go
index e7c2f8ada..6bd6da74a 100644
--- a/libcontainer/rootfs_linux.go
+++ b/libcontainer/rootfs_linux.go
@@ -748,6 +748,41 @@ func pivotRoot(rootfs string) error {
}
func msMoveRoot(rootfs string) error {
+ mountinfos, err := mount.GetMounts()
+ if err != nil {
+ return err
+ }
+
+ absRootfs, err := filepath.Abs(rootfs)
+ if err != nil {
+ return err
+ }
+
+ for _, info := range mountinfos {
+ p, err := filepath.Abs(info.Mountpoint)
+ if err != nil {
+ return err
+ }
+ // Unmount every sysfs and proc file system, except those under the container rootfs
+ if (info.Fstype != "proc" && info.Fstype != "sysfs") || filepath.HasPrefix(p, absRootfs) {
+ continue
+ }
+ // Be sure umount events are not propagated to the host.
+ if err := unix.Mount("", p, "", unix.MS_SLAVE|unix.MS_REC, ""); err != nil {
+ return err
+ }
+ if err := unix.Unmount(p, unix.MNT_DETACH); err != nil {
+ if err != unix.EINVAL && err != unix.EPERM {
+ return err
+ } else {
+ // If we lack the privileges to unmount (e.g. rootless), then
+ // cover the path.
+ if err := unix.Mount("tmpfs", p, "tmpfs", 0, ""); err != nil {
+ return err
+ }
+ }
+ }
+ }
if err := unix.Mount(rootfs, "/", "", unix.MS_MOVE, ""); err != nil {
return err
}
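
The loop added to `msMoveRoot` above skips any mount that is not procfs or sysfs, and any mount located under the container rootfs. A sketch of that selection rule as a standalone predicate follows. The names `underRoot` and `shouldUnmount` are hypothetical. Note one deliberate difference, which is an assumption of this sketch and not what the patch does: the patch calls `filepath.HasPrefix`, a plain string-prefix check that is deprecated because it ignores path boundaries, whereas the sketch compares on a separator boundary.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// underRoot is a hypothetical helper (not part of runc). It reports whether
// path is root itself or lies beneath it, comparing on a separator boundary.
func underRoot(path, root string) bool {
	path, root = filepath.Clean(path), filepath.Clean(root)
	if path == root {
		return true
	}
	sep := string(filepath.Separator)
	if !strings.HasSuffix(root, sep) {
		root += sep
	}
	return strings.HasPrefix(path, root)
}

// shouldUnmount is a hypothetical helper that mirrors the patch's filter:
// only proc and sysfs mounts outside the container rootfs are candidates.
func shouldUnmount(fstype, mountpoint, rootfs string) bool {
	if fstype != "proc" && fstype != "sysfs" {
		return false
	}
	return !underRoot(mountpoint, rootfs)
}

func main() {
	// Assumed example paths, for illustration only.
	rootfs := "/var/lib/ctr/rootfs"
	fmt.Println(shouldUnmount("proc", "/proc", rootfs))
	fmt.Println(shouldUnmount("proc", rootfs+"/proc", rootfs))
}
```

With a plain string-prefix check, a sibling directory such as "/var/lib/ctr/rootfs-other" would be treated as lying under "/var/lib/ctr/rootfs"; the boundary-aware comparison avoids that.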


@@ -1,46 +1,52 @@
%global with_debug 1
%global with_bundled 1
%global with_check 0
%if 0%{?with_debug}
%global _find_debuginfo_dwz_opts %{nil}
%global _dwz_low_mem_die_limit 0
%else
%global debug_package %{nil}
%endif
%if 0%{?rhel} > 7 && ! 0%{?fedora}
%define gobuild(o:) \
go build -buildmode pie -compiler gc -tags="rpm_crashtraceback libtrust_openssl ${BUILDTAGS:-}" -ldflags "${LDFLAGS:-} -linkmode=external -compressdwarf=false -B 0x$(head -c20 /dev/urandom|od -An -tx1|tr -d ' \\n') -extldflags '%__global_ldflags'" -a -v %{?**};
%else
%if ! 0%{?gobuild:1}
%define gobuild(o:) GO111MODULE=off go build -buildmode pie -compiler gc -tags="rpm_crashtraceback ${BUILDTAGS:-}" -ldflags "${LDFLAGS:-} -linkmode=external -B 0x$(head -c20 /dev/urandom|od -An -tx1|tr -d ' \\n') -extldflags '-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld '" -a -v %{?**};
%endif
%endif
go build -buildmode pie -compiler gc -tags="rpm_crashtraceback no_openssl ${BUILDTAGS:-}" -ldflags "${LDFLAGS:-} -compressdwarf=false -B 0x$(head -c20 /dev/urandom|od -An -tx1|tr -d ' \\n') -extldflags '%__global_ldflags'" -a -v -x %{?**};
%endif # distro
%global provider github
%global provider_tld com
%global project opencontainers
%global repo runc
# https://github.com/opencontainers/runc
%global import_path %{provider}.%{provider_tld}/%{project}/%{repo}
%global git0 https://%{import_path}
%global provider_prefix %{provider}.%{provider_tld}/%{project}/%{repo}
%global import_path %{provider_prefix}
%global git0 https://github.com/opencontainers/runc
%global commit0 2abd837c8c25b0102ac4ce14f17bc0bc7ddffba7
%global shortcommit0 %(c=%{commit0}; echo ${c:0:7})
Epoch: 1
Name: %{repo}
Version: 1.1.4
Release: 1%{?dist}
Version: 1.0.0
Release: 56.rc5.dev.git%{shortcommit0}%{?dist}
Summary: CLI for running Open Containers
# https://fedoraproject.org/wiki/PackagingDrafts/Go#Go_Language_Architectures
#ExclusiveArch: %%{go_arches}
# still use arch exclude as the macro above still refers %%{ix86} in RHEL8.4:
# https://bugzilla.redhat.com/show_bug.cgi?id=1905383
ExcludeArch: %{ix86}
License: ASL 2.0
URL: %{git0}
Source0: %{git0}/archive/v%{version}.tar.gz
Patch0: https://patch-diff.githubusercontent.com/raw/opencontainers/runc/pull/3468.patch
Provides: oci-runtime
BuildRequires: golang >= 1.17.7
BuildRequires: git
BuildRequires: /usr/bin/go-md2man
BuildRequires: libseccomp-devel >= 2.5
Requires: libseccomp >= 2.5
URL: http//%{provider_prefix}
Source0: %{git0}/archive/%{commit0}/%{repo}-%{shortcommit0}.tar.gz
Source1: 99-containers.conf
Patch0: change-default-root.patch
Patch1: 0001-Revert-Apply-cgroups-earlier.patch
Patch2: 1807.patch
Patch3: 0001-nsenter-clone-proc-self-exe-to-avoid-exposing-host-b-runc.patch
Patch4: pivot-root.patch
Requires: criu
Requires(pre): container-selinux >= 2:2.2-2
# If go_compiler is not set to 1, there is no virtual provide. Use golang instead.
BuildRequires: %{?go_compiler:compiler(go-compiler)}%{!?go_compiler:golang} >= 1.6.2
BuildRequires: git
BuildRequires: go-md2man
BuildRequires: libseccomp-devel
%description
The runc command can be used to start containers which are packaged
@@ -48,7 +54,7 @@ in accordance with the Open Container Initiative's specifications,
and to manage containers running under runc.
%prep
%autosetup -Sgit
%autosetup -Sgit -n %{repo}-%{commit0}
sed -i '/\#\!\/bin\/bash/d' contrib/completions/bash/%{name}
%build
@@ -59,11 +65,8 @@ pushd GOPATH
popd
pushd GOPATH/src/%{import_path}
export GO111MODULE=off
export GOPATH=%{gopath}:$(pwd)/GOPATH
export CGO_CFLAGS="%{optflags} -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64"
export BUILDTAGS="selinux seccomp"
export LDFLAGS="-X main.gitCommit= -X main.version=%{version}"
%gobuild -o %{name} %{import_path}
pushd man
@@ -71,7 +74,15 @@ pushd man
popd
%install
make install install-man install-bash DESTDIR=$RPM_BUILD_ROOT PREFIX=%{_prefix} LIBDIR=%{_libdir} BINDIR=%{_bindir}
install -d -p %{buildroot}%{_bindir}
install -p -m 755 %{name} %{buildroot}%{_bindir}
# install man pages
install -d -p %{buildroot}%{_mandir}/man8
install -p -m 644 man/man8/* %{buildroot}%{_mandir}/man8
# install bash completion
install -d -p %{buildroot}%{_datadir}/bash-completion/completions
install -p -m 0644 contrib/completions/bash/%{name} %{buildroot}%{_datadir}/bash-completion/completions
%check
@@ -86,147 +97,12 @@ make install install-man install-bash DESTDIR=$RPM_BUILD_ROOT PREFIX=%{_prefix}
%{_datadir}/bash-completion/completions/%{name}
%changelog
* Mon Aug 29 2022 Jindrich Novy <jnovy@redhat.com> - 1:1.1.4-1
- update to https://github.com/opencontainers/runc/releases/tag/v1.1.4
- Related: #2061390
* Mon Jun 13 2022 Jindrich Novy <jnovy@redhat.com> - 1:1.1.3-2
- update to https://github.com/opencontainers/runc/releases/tag/v1.1.3
- Related: #2061390
* Thu Jun 09 2022 Jindrich Novy <jnovy@redhat.com> - 1:1.1.3-1
- update to https://github.com/opencontainers/runc/releases/tag/v1.1.3
- Related: #2061390
* Fri Jun 03 2022 Jindrich Novy <jnovy@redhat.com> - 1:1.1.2-1
- update to https://github.com/opencontainers/runc/releases/tag/v1.1.2
- Related: #2061390
* Fri Apr 08 2022 Jindrich Novy <jnovy@redhat.com> - 1:1.0.3-3
- bump golang BR to 1.17.7
- Related: #2061390
* Fri Mar 11 2022 Jindrich Novy <jnovy@redhat.com> - 1:1.0.3-2
- require at least libseccomp >= 2.5
- Resolves: #2053990
- Related: #2061390
* Fri Feb 18 2022 Jindrich Novy <jnovy@redhat.com> - 1:1.0.3-1
- rollback to 1.0.3 due to gating test issues
- Related: #2001445
* Tue Jan 18 2022 Jindrich Novy <jnovy@redhat.com> - 1.1.0-1
- update to https://github.com/opencontainers/runc/releases/tag/v1.1.0
- Related: #2001445
* Mon Dec 06 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.3-1
- update to https://github.com/opencontainers/runc/releases/tag/v1.0.3
- Related: #2001445
* Wed Aug 25 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.2-1
- update to https://github.com/opencontainers/runc/releases/tag/v1.0.2
- Related: #1934415
* Fri Aug 06 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.1-5
- do not use versioned provide
- Related: #1934415
* Thu Jul 29 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.1-4
- fix "unknown version" displayed by runc -v
- Related: #1934415
* Mon Jul 26 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.1-3
- be sure to compile runc binaries the right way
- Related: #1934415
* Mon Jul 26 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.1-2
- use Makefile
- Related: #1934415
* Wed Jul 21 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.1-1
- update to https://github.com/opencontainers/runc/releases/tag/v1.0.1
- Related: #1934415
* Thu May 20 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.0-76.rc95
- updated to rc95 to fix CVE-2021-30465
- Related: #1934415
* Tue May 18 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.0-75.rc94
- set GO111MODULE=off to fix build
- Related: #1934415
* Fri May 14 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.0-74.rc94
- update to https://github.com/opencontainers/runc/releases/tag/v1.0.0-rc94
- Related: #1934415
* Tue May 11 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.0-73.rc93
- fix CVE-2021-30465
- Related: #1934415
* Tue Mar 30 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.0-72.rc93
- upload rc93 tarball
- Related: #1934415
* Tue Mar 30 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.0-71.rc93
- update to rc93
- Related: #1934415
* Fri Jan 29 2021 Jindrich Novy <jnovy@redhat.com> - 1.0.0-70.rc92
- add missing Provides: oci-runtime = 1
- Related: #1883490
* Tue Dec 08 2020 Jindrich Novy <jnovy@redhat.com> - 1.0.0-69.rc92
- still use ExcludeArch as go_arches macro is broken for 8.4
- Related: #1883490
* Tue Aug 11 2020 Jindrich Novy <jnovy@redhat.com> - 1.0.0-68.rc92
- update to https://github.com/opencontainers/runc/releases/tag/v1.0.0-rc92
- propagate proper CFLAGS to CGO_CFLAGS to assure code hardening and optimization
- Related: #1821193
* Thu Jul 02 2020 Jindrich Novy <jnovy@redhat.com> - 1.0.0-67.rc91
- update to https://github.com/opencontainers/runc/releases/tag/v1.0.0-rc91
- Related: #1821193
* Tue May 12 2020 Jindrich Novy <jnovy@redhat.com> - 1.0.0-66.rc10
- synchronize containter-tools 8.3.0 with 8.2.1
- Related: #1821193
* Wed Feb 12 2020 Jindrich Novy <jnovy@redhat.com> - 1.0.0-65.rc10
- address CVE-2019-19921 by updating to rc10
- Resolves: #1801887
* Wed Dec 11 2019 Jindrich Novy <jnovy@redhat.com> - 1.0.0-64.rc9
- use no_openssl in BUILDTAGS (no vendored crypto in runc)
- Related: RHELPLAN-25139
* Mon Dec 09 2019 Jindrich Novy <jnovy@redhat.com> - 1.0.0-63.rc9
- be sure to use golang >= 1.12.12-4
- Related: RHELPLAN-25139
* Thu Nov 28 2019 Jindrich Novy <jnovy@redhat.com> - 1.0.0-62.rc9
* Thu Nov 28 2019 Jindrich Novy <jnovy@redhat.com> - 1.0.0-56.rc5.dev.git2abd837
- rebuild because of CVE-2019-9512 and CVE-2019-9514
- Resolves: #1766331, #1766303
* Thu Nov 21 2019 Jindrich Novy <jnovy@redhat.com> - 1.0.0-61.rc9
- update to runc 1.0.0-rc9 release
- amend golang deps
- fixes CVE-2019-16884
- Resolves: #1759651
* Mon Jun 17 2019 Lokesh Mandvekar <lsm5@redhat.com> - 1.0.0-60.rc8
- Resolves: #1721247 - enable fips mode
* Mon Jun 17 2019 Lokesh Mandvekar <lsm5@redhat.com> - 1.0.0-59.rc8
- Resolves: #1720654 - rebase to v1.0.0-rc8
* Thu Apr 11 2019 Eduardo Santiago <santiago@redhat.com> - 1.0.0-57.rc5.dev.git2abd837
- Resolves: #1693424 - podman rootless: cannot specify gid= mount options
* Wed Feb 27 2019 Lokesh Mandvekar <lsm5@redhat.com> - 1.0.0-56.rc5.dev.git2abd837
- change-default-root patch not needed as there's no docker on rhel8
- Resolves: #1766328, #1766300
* Tue Feb 12 2019 Lokesh Mandvekar <lsm5@redhat.com> - 1.0.0-55.rc5.dev.git2abd837
- Resolves: #1665770 - rootfs: umount all procfs and sysfs with --no-pivot
- Resolves: CVE-2019-5736
* Tue Dec 18 2018 Frantisek Kluknavsky <fkluknav@redhat.com> - 1.0.0-54.rc5.dev.git2abd837