Adds low-level support for launching Linux containers with cgroup namespaces.
* gnu/build/linux-container.scm (%namespaces): Add 'cgroup.
(namespaces->bit-mask): Handle it.
* guix/build/syscalls.scm (CLONE_NEWCGROUP): New variable.
Signed-off-by: Ludovic Courtès <ludo@gnu.org>
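For illustration, the mapping from namespace names to a clone(2) flag mask
can be sketched as below. This is a Python rendering of the idea, not the
Scheme code in gnu/build/linux-container.scm; the function name and the
lookup table are hypothetical, while the constant values are those of the
Linux headers.

```python
# Namespace flags for clone(2), with the values from <linux/sched.h>.
CLONE_NEWNS = 0x00020000      # mount namespace
CLONE_NEWCGROUP = 0x02000000  # cgroup namespace
CLONE_NEWUTS = 0x04000000     # hostname and domain name
CLONE_NEWIPC = 0x08000000     # System V IPC, POSIX message queues
CLONE_NEWUSER = 0x10000000    # user and group IDs
CLONE_NEWPID = 0x20000000     # process IDs
CLONE_NEWNET = 0x40000000     # network stack

# Hypothetical lookup table keyed by the short names used in this sketch.
NAMESPACE_FLAGS = {
    "mnt": CLONE_NEWNS,
    "cgroup": CLONE_NEWCGROUP,
    "uts": CLONE_NEWUTS,
    "ipc": CLONE_NEWIPC,
    "user": CLONE_NEWUSER,
    "pid": CLONE_NEWPID,
    "net": CLONE_NEWNET,
}


def namespaces_to_bit_mask(namespaces):
    """Return the bitwise OR of the clone(2) flags named in NAMESPACES."""
    mask = 0
    for name in namespaces:
        mask |= NAMESPACE_FLAGS[name]  # raises KeyError for an unknown name
    return mask
```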
This broke 'guix environment --container' on non-Debian distributions.
Fixes <https://bugs.gnu.org/45066>. Reported by luhux <luhux@outlook.com>.
This reverts commit 8bc5ca5160.
Fixes <https://bugs.gnu.org/31977>.
Reported by Paul Garlick <pgarlick@tourbillion-technology.com>.
* gnu/build/linux-container.scm (unprivileged-user-namespace-supported?):
Return #f when the 'userns-file' does not exist.
This is a follow-up to 5316dfc0f1. Some users of 'run-container' may
expect the container to be jailed even when there are no mounts. This is
the case for some Guix tests.
* gnu/build/linux-container.scm (run-container): Do not jail the container
when the requested root is "/".
We may want to run a container inside the MNT namespace, without jailing the
container. If RUN-CONTAINER is passed a null MOUNTS list, do not jail the
container.
* gnu/build/linux-container.scm (run-container): Do not call
MOUNT-FILE-SYSTEMS if MOUNTS list is empty.
* gnu/build/linux-container.scm (call-with-container): Add
#:process-spawned-hook and honor it.
* gnu/system/linux-container.scm (container-script)[script]:
Define 'explain' and pass it as #:process-spawned-hook.
Fixes <https://bugs.gnu.org/36463>.
Reported by Steffen Rytter Postas <nc@scalehost.eu>.
* gnu/build/linux-container.scm (mount-file-systems): When /dev/ptmx
exists on the host, explicitly mount a new instance of devpts and make
/dev/ptmx a symlink to /dev/pts/ptmx.
Fixes a bug whereby derivations importing (gnu build linux-container),
such as the 'bitlbee' and 'tor' services, would depend on the
user's (guix config) file, which was pulled as a dependency of (guix
utils). As a result, those derivations would vary from user to user.
* gnu/build/linux-container.scm (call-with-temporary-directory): New
procedure.
Typically 'read-pid-file/container' would fail when starting services in
containers such as BitlBee.
* gnu/build/linux-container.scm (call-with-clean-exit): Use
'primitive-_exit' instead of 'primitive-exit'.
(container-excursion*): Close OUT.
* gnu/build/file-systems.scm (mount-file-system): Rename 'spec' to 'fs'
and assume it's a <file-system>.
* gnu/build/linux-boot.scm (boot-system): Assume MOUNTS is a list of
<file-system> and adjust accordingly.
* gnu/build/linux-container.scm (mount-file-systems): Remove
'file-system->spec' call.
* gnu/services/base.scm (file-system-shepherd-service): Add
'spec->file-system' call. Add (gnu system file-systems) to 'modules'.
* gnu/system/linux-initrd.scm (raw-initrd): Use (gnu system
file-systems). Add 'spec->file-system' call for #:mounts.
* gnu/build/linux-container.scm (container-excursion*): New procedure.
* tests/containers.scm ("container-excursion*")
("container-excursion*, same namespaces"): New tests.
This avoids problems where 'isatty?' returns #t but 'ttyname' fails with
ENOTTY or a similar error.
* gnu/build/linux-container.scm (mount-file-systems): Remove call to
'isatty?'. Directly call 'ttyname' and catch 'system-error'.
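The "call it and catch the error" approach can be sketched in Python, whose
os.ttyname raises OSError when the descriptor is not a terminal. The helper
name is hypothetical and this is only an analogy to the Scheme change, not
the code it touches.

```python
import os


def tty_name_or_none(fd):
    """Return the terminal device name for FD, or None if there is none."""
    try:
        # No prior isatty() check: the call itself is the test.
        return os.ttyname(fd)
    except OSError:
        # Covers ENOTTY and any other failure reported for FD.
        return None
```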
* gnu/build/linux-container.scm (mount-file-systems): 'mounts' is now a
list of <file-system> objects instead of a list of lists ("specs").
Add call to 'file-system->spec' as the argument to 'mount-file-system'.
(run-container, call-with-container): Adjust docstring accordingly.
* gnu/system/file-systems.scm (spec->file-system): New procedure.
* gnu/system/linux-container.scm (container-script)[script]: Call
'spec->file-system' inside gexp.
* guix/scripts/environment.scm (launch-environment/container): Remove
call to 'file-system->spec'.
* tests/containers.scm ("call-with-container, mnt namespace")
("call-with-container, mnt namespace, wrong bind mount"): Pass a list of
<file-system> objects.
Before that, 'container-excursion' would call 'setns' even when the
target namespace is the one the caller is already in, which would fail.
* gnu/build/linux-container.scm (container-excursion): Introduce
'source' and 'target'. Compare the result of 'readlink' on these
instead of comparing file descriptors to decide whether to call
'setns'.
* tests/containers.scm ("container-excursion, same namespace"): New test.
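The comparison described above can be sketched as follows, in Python rather
than Scheme. It relies on the fact that the files under /proc/PID/ns are
symbolic links whose targets name the namespace (for example
"mnt:[4026531840]"), so two processes share a namespace when the link
targets are equal. The function name is hypothetical.

```python
import os


def same_namespace(source, target):
    """Return True if the namespace links SOURCE and TARGET are identical.

    Both arguments are paths such as /proc/self/ns/mnt.  The link targets
    are compared as strings; file descriptors opened on the two links would
    differ even when the namespace is the same.
    """
    return os.readlink(source) == os.readlink(target)
```

A caller would skip the setns step when same_namespace returns True for a
given namespace file.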
Fixes <http://bugs.gnu.org/23306>.
* gnu/build/linux-container.scm (run-container): Use 'socketpair'
instead of 'pipe'. Rename 'in' to 'child' and 'out' to 'parent'. Send
a 'ready message or an exception argument list from the child to the
parent; adjust the parent accordingly.
* tests/containers.scm ("call-with-container, mnt namespace, wrong bind
mount"): New test.
* tests/guix-environment-container.sh: Add test with
--expose=/does-not-exist.
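A minimal sketch of the child-to-parent handshake, written in Python under
the assumption that the message is either a "ready" marker or a description
of the exception raised during setup. The function name and the JSON
encoding are hypothetical choices for this sketch; the Scheme code sends
Scheme data instead.

```python
import json
import os
import socket


def run_child(setup):
    """Fork, run SETUP in the child, and return the child's report.

    The report is ["ready"] on success, or ["error", type-name, message]
    when SETUP raised an exception.
    """
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: keep only its own end of the socket pair.
        parent_sock.close()
        try:
            setup()
            message = ["ready"]
        except Exception as e:
            message = ["error", type(e).__name__, str(e)]
        child_sock.sendall(json.dumps(message).encode())
        child_sock.close()
        os._exit(0)  # leave without running the parent's cleanup handlers
    # Parent: close the child's end so that read() sees end-of-file.
    child_sock.close()
    with parent_sock.makefile("rb") as port:
        data = port.read()
    parent_sock.close()
    os.waitpid(pid, 0)
    return json.loads(data.decode())
```

With this shape, a setup failure such as a bind mount of a nonexistent
directory reaches the parent as data instead of being lost in the child.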
* gnu/build/linux-container.scm (unprivileged-user-namespace-supported?): Only
read and check the first character, to cope with a possible newline in the
(pseudo-)file.
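Reading only the first character can be sketched like this in Python. The
function name is hypothetical, and the sketch deliberately leaves out what
to do when the file is missing.

```python
def first_char_is_one(path):
    """Return True if the first character of the file at PATH is "1".

    Only one character is read, so a trailing newline (as in "1\n") does
    not affect the result.
    """
    with open(path) as port:
        return port.read(1) == "1"
```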
* gnu/build/linux-container.scm (namespaces->bit-mask): Remove
CLONE_CHILD_CLEARTID and CLONE_CHILD_SETTID, which are unneeded.
Discussed at <http://bugs.gnu.org/21694>.
Before, call-with-clean-exit would *always* return an exit code of 1.
* gnu/build/linux-container.scm (call-with-clean-exit): Exit with a status
code of 0 if THUNK does not throw an exception.
* tests/containers.scm: Add test.
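The intended behavior (status 0 on normal return, 1 on an exception) can be
sketched in Python as follows. The function name is hypothetical; os._exit
plays the role of an immediate exit that skips cleanup handlers inherited
from the parent.

```python
import os


def exit_status_of(thunk):
    """Run THUNK in a forked child and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        try:
            thunk()
        except BaseException:
            os._exit(1)  # THUNK raised: report failure
        os._exit(0)      # THUNK returned normally: report success
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```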
The intent is to make 'clone' behave a lot more like 'primitive-fork', which
calls clone(2) with SIGCHLD, CLONE_CHILD_CLEARTID, and CLONE_CHILD_SETTID
flags. Notably, running 'clone' at the REPL without these flags would break
the REPL beyond repair.
* guix/build/syscalls.scm (CLONE_CHILD_CLEARTID, CLONE_CHILD_SETTID): New
variables.
* gnu/build/linux-container.scm (namespaces->bit-mask): Add
CLONE_CHILD_CLEARTID and CLONE_CHILD_SETTID to bit mask.
It's not always possible to map 65536 uids when creating a container as the
root user within another user namespace. This is true when building Guix
within the build daemon's container. By using a uid range of 1 by default,
even as the root user, the tests now pass.
* gnu/build/linux-container.scm (initialize-user-namespace, run-container):
Add 'host-uids' argument.
(call-with-container): Add #:host-uids keyword argument.
* tests/containers.scm ("container-excursion"): Update 'run-container' call.
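As a sketch of what the uid range controls: per user_namespaces(7), each
line of /proc/PID/uid_map has the form "ID-inside-ns ID-outside-ns length".
The Python helper below only builds such a line; its name and parameter
names are hypothetical, and the default of 1 mirrors the default described
above.

```python
def uid_map_entry(guest_uid, host_uid, count=1):
    """Return a uid_map line mapping COUNT uids starting at GUEST_UID
    inside the namespace to uids starting at HOST_UID outside it."""
    return f"{guest_uid} {host_uid} {count}"
```

With the default, uid_map_entry(0, 1000) maps only uid 0 in the container
to uid 1000 on the host, whereas a count of 65536 requests the full range,
which may be refused inside another user namespace.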
* gnu/build/linux-container.scm: New file.
* gnu-system.am (GNU_SYSTEM_MODULES): Add it.
* .dir-locals.el: Add Scheme indent rules for 'call-with-container' and
'container-excursion'.
* tests/containers.scm: New file.
* Makefile.am (SCM_TESTS): Add it.