File 0001-Reexec-using-proc-self-exe.patch of Package podman-for-qemu-user
From 65aa10877cea40990944e4adaffcf6cf1c0fbfbb Mon Sep 17 00:00:00 2001
From: Fabian Vogt <fvogt@suse.de>
Date: Fri, 23 Aug 2024 11:28:42 +0200
Subject: [PATCH] Reexec using /proc/self/exe

In OBS, builds for RISC-V use QEMU userspace emulation. QEMU doesn't
implement some kernel features needed by container runtimes, so it's
necessary to skip most of them by passing --isolation=chroot (to not
use OCI runtimes for building) and --security-opt=seccomp=unconfined
(to disable the use of seccomp for syscall filtering). Those are not
sufficient on their own, however:

When using podman's builtin "chroot" isolation method for building,
it uses a double execve of itself to avoid CVE-2019-5736:
For the first, it copies /proc/self/exe into a memfd created with
MFD_CLOEXEC, which is subsequently run with execveat. This is
incompatible with binfmt_misc because by the time the interpreter
runs, the FD has already been closed (documented in execveat(2)).
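
For illustration only (not part of this change), a minimal sketch of
the close-on-exec flag that causes the problem. The helper names
make_memfd() and fd_is_cloexec() and the memfd name "reexec-demo" are
hypothetical, and the raw syscall is used here only to avoid depending
on feature-test macros; the vendored code may create its memfd
differently.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/memfd.h>  /* MFD_CLOEXEC */
#include <sys/syscall.h>  /* SYS_memfd_create */
#include <unistd.h>

/* Hypothetical helper: create an empty memfd with the given MFD_* flags.
 * Returns the new FD, or -1 on error. */
static int make_memfd(unsigned int flags)
{
	return (int)syscall(SYS_memfd_create, "reexec-demo", flags);
}

/* Hypothetical helper: report whether fd would be closed across execve.
 * Returns 1 if FD_CLOEXEC is set, 0 if not, -1 on error. */
static int fd_is_cloexec(int fd)
{
	int flags;

	assert(fd >= 0);
	flags = fcntl(fd, F_GETFD);
	if (flags == -1)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

A memfd created with MFD_CLOEXEC reports 1 here, so an interpreter
started through binfmt_misc can no longer open it via /dev/fd/N.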

Hacking around that by removing CLOEXEC is not enough, because then
the second execve fails: The process that was just executed has
/dev/fd/5 as executable, but a pipe was passed that gets mapped to
FD 5 using dup3(17, 5, 0). When QEMU gets the call to execve
/proc/self/exe, it converts it to /dev/fd/5, which by then is a pipe
and not executable, resulting in -EACCES.

Hacking around that by making sure the executed memfd has an FD number
bigger than 5 works: fd = fcntl(fd, F_DUPFD, 5); fcntl(fd, F_SETFD, 0);
However, this is quite dirty as it's a visible file descriptor leak and
still relies on the application to keep the FD open.

Instead, disable the isolation by simply execve'ing /proc/self/exe
for both execve calls, bypassing the issue with MFD_CLOEXEC and the
closed FD in one go, in the single place where it's needed. In OBS,
the build runs in a VM anyway, so isolation isn't needed for security
reasons. An additional unsatisfiable package dependency makes sure
that it's never installed outside of OBS by accident.
---
.../github.com/containers/storage/pkg/unshare/unshare.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/vendor/github.com/containers/storage/pkg/unshare/unshare.c b/vendor/github.com/containers/storage/pkg/unshare/unshare.c
index a2800654f9ea..0951a5f99654 100644
--- a/vendor/github.com/containers/storage/pkg/unshare/unshare.c
+++ b/vendor/github.com/containers/storage/pkg/unshare/unshare.c
@@ -281,14 +281,7 @@ static int containers_reexec(int flags) {
 		return -1;
 	}
 
-	if (flags & CLONE_NEWNS)
-		fd = try_bindfd();
-	if (fd < 0)
-		fd = copy_self_proc_exe(argv);
-	if (fd < 0)
-		return fd;
-
-	if (fexecve(fd, argv, environ) == -1) {
+	if (execve("/proc/self/exe", argv, environ) == -1) {
 		close(fd);
 		fprintf(stderr, "Error during reexec(...): %m\n");
 		return -1;
--
2.45.2