doc: cookbook: Add "Installing Guix on a Cluster" chapter.
This is derived from the article at
<https://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster/>,
with clarifications and updates.

* doc/guix-cookbook.texi (Installing Guix on a Cluster): New chapter.
@@ -21,7 +21,8 @@ Copyright @copyright{} 2020 Brice Waegeneire@*
Copyright @copyright{} 2020 André Batista@*
Copyright @copyright{} 2020 Christine Lemmer-Webber@*
Copyright @copyright{} 2021 Joshua Branson@*
Copyright @copyright{} 2022 Maxim Cournoyer@*
Copyright @copyright{} 2023 Ludovic Courtès

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
@@ -73,8 +74,9 @@ Weblate} (@pxref{Translating Guix,,, guix, GNU Guix reference manual}).
* Packaging::  Packaging tutorials
* System Configuration::  Customizing the GNU System
* Containers::  Isolated environments and nested systems
* Advanced package management::  Power to the users!
* Environment management::  Control environment
* Installing Guix on a Cluster::  High-performance computing.

* Acknowledgments::  Thanks!
* GNU Free Documentation License::  The license of this document.
@@ -83,28 +85,45 @@ Weblate} (@pxref{Translating Guix,,, guix, GNU Guix reference manual}).
@detailmenu
 --- The Detailed Node Listing ---

Scheme tutorials

* A Scheme Crash Course::  Learn the basics of Scheme

Packaging

* Packaging Tutorial::  A tutorial on how to add packages to Guix.

System Configuration

* Auto-Login to a Specific TTY::  Automatically Login a User to a Specific TTY
* Customizing the Kernel::  Creating and using a custom Linux kernel on Guix System.
* Guix System Image API::  Customizing images to target specific platforms.
* Using security keys::  How to use security keys with Guix System.
* Connecting to Wireguard VPN::  Connecting to a Wireguard VPN.
* Customizing a Window Manager::  Handle customization of a Window manager on Guix System.
* Running Guix on a Linode Server::  Running Guix on a Linode Server
* Setting up a bind mount::  Setting up a bind mount in the file-systems definition.
* Getting substitutes from Tor::  Configuring Guix daemon to get substitutes through Tor.
* Setting up NGINX with Lua::  Configuring NGINX web-server to load Lua modules.
* Music Server with Bluetooth Audio::  Headless music player with Bluetooth output.

Containers

* Guix Containers::  Perfectly isolated environments
* Guix System Containers::  A system inside your system

Advanced package management

* Guix Profiles in Practice::  Strategies for multiple profiles and manifests.

Environment management

* Guix environment via direnv::  Setup Guix environment with direnv

Installing Guix on a Cluster

* Setting Up a Head Node::  The node that runs the daemon.
* Setting Up Compute Nodes::  Client nodes.
* Cluster Network Access::  Dealing with network access restrictions.
* Cluster Disk Usage::  Disk usage considerations.
* Cluster Security Considerations::  Keeping the cluster secure.

@end detailmenu
@end menu

@@ -3635,6 +3654,380 @@ will have predefined environment variables and procedures.

Run @command{direnv allow} to set up the environment for the first time.


@c *********************************************************************
@node Installing Guix on a Cluster
@chapter Installing Guix on a Cluster

@cindex cluster installation
@cindex high-performance computing, HPC
@cindex HPC, high-performance computing
Guix is appealing to scientists and @acronym{HPC, high-performance
computing} practitioners: it makes it easy to deploy potentially complex
software stacks, and it lets you do so in a reproducible fashion---you
can redeploy the exact same software on different machines and at
different points in time.

In this chapter we look at how a cluster sysadmin can install Guix for
system-wide use, such that it can be used on all the cluster nodes, and
discuss the various tradeoffs@footnote{This chapter is adapted from a
@uref{https://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster/,
blog post published on the Guix-HPC web site in 2017}.}.

@quotation Note
Here we assume that the cluster is running a GNU/Linux distro other than
Guix System and that we are going to install Guix on top of it.
@end quotation

@menu
* Setting Up a Head Node::  The node that runs the daemon.
* Setting Up Compute Nodes::  Client nodes.
* Cluster Network Access::  Dealing with network access restrictions.
* Cluster Disk Usage::  Disk usage considerations.
* Cluster Security Considerations::  Keeping the cluster secure.
@end menu

@node Setting Up a Head Node
@section Setting Up a Head Node

The recommended approach is to set up one @emph{head node} running
@command{guix-daemon} and exporting @file{/gnu/store} over NFS to
compute nodes.

Remember that @command{guix-daemon} is responsible for spawning build
processes and downloads on behalf of clients (@pxref{Invoking
guix-daemon,,, guix, GNU Guix Reference Manual}), and, more generally,
for accessing @file{/gnu/store}, which contains all the package binaries
built by all the users (@pxref{The Store,,, guix, GNU Guix Reference
Manual}).  ``Client'' here refers to all the Guix commands that users
see, such as @code{guix install}.  On a cluster, these commands may be
running on the compute nodes and we'll want them to talk to the head
node's @code{guix-daemon} instance.

To begin with, the head node can be installed following the usual binary
installation instructions (@pxref{Binary Installation,,, guix, GNU Guix
Reference Manual}).  Thanks to the installation script, this should be
quick.  Once installation is complete, we need to make some adjustments.

Since we want @code{guix-daemon} to be reachable not just from the head
node but also from the compute nodes, we need to arrange for it to
listen for connections over TCP/IP.  To do that, we'll edit the systemd
startup file for @command{guix-daemon},
@file{/etc/systemd/system/guix-daemon.service}, and add a
@code{--listen} argument to the @code{ExecStart} line so that it looks
something like this:

@example
ExecStart=/var/guix/profiles/per-user/root/current-guix/bin/guix-daemon --build-users-group=guixbuild --listen=/var/guix/daemon-socket/socket --listen=0.0.0.0
@end example

For these changes to take effect, the service needs to be restarted:

@example
systemctl daemon-reload
systemctl restart guix-daemon
@end example
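
As a quick sanity check, you can verify that the daemon now accepts TCP
connections; the snippet below is a sketch that assumes the
@command{ss} tool from the iproute2 package is available on the head
node:

@example
# Check that guix-daemon is listening on its default TCP port.
ss -tln | grep 44146
@end example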

@quotation Note
The @code{--listen=0.0.0.0} bit means that @code{guix-daemon} will
process @emph{all} incoming TCP connections on port 44146
(@pxref{Invoking guix-daemon,,, guix, GNU Guix Reference Manual}).  This
is usually fine in a cluster setup where the head node is reachable
exclusively from the cluster's local area network---you don't want that
to be exposed to the Internet!
@end quotation

The next step is to define our NFS exports in
@uref{https://linux.die.net/man/5/exports,@file{/etc/exports}} by adding
something along these lines:

@example
/gnu/store    *(ro)
/var/guix     *(rw,async)
/var/log/guix *(ro)
@end example

The @file{/gnu/store} directory can be exported read-only since only
@command{guix-daemon} on the head node will ever modify it.
@file{/var/guix} contains @emph{user profiles} as managed by @code{guix
package}; thus, to allow users to install packages with @code{guix
package}, this must be read-write.
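
For the new exports to take effect without rebooting, you can ask the
NFS server to re-read @file{/etc/exports}.  The commands below are a
sketch, assuming a Linux NFS server with the usual nfs-utils tools:

@example
# Re-export all the directories listed in /etc/exports.
exportfs -ra

# Double-check what is now being exported.
showmount -e localhost
@end example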

Users can create as many profiles as they like in addition to the
default profile, @file{~/.guix-profile}.  For instance, @code{guix
package -p ~/dev/python-dev -i python} installs Python in a profile
reachable from the @file{~/dev/python-dev} symlink.  To make sure that
this profile is protected from garbage collection---i.e., that Python
will not be removed from @file{/gnu/store} while this profile
exists---@emph{home directories should be mounted on the head node} as
well, so that @code{guix-daemon} knows about these non-standard profiles
and avoids collecting software they refer to.
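
If in doubt, users can list the garbage collector roots that protect
their profiles with the @option{--list-roots} option of @command{guix
gc} (assuming a reasonably recent Guix on the head node):

@example
# List GC roots belonging to the current user, including profiles
# created with 'guix package -p'.
guix gc --list-roots
@end example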

It may be a good idea to periodically remove unused bits from
@file{/gnu/store} by running @command{guix gc} (@pxref{Invoking guix
gc,,, guix, GNU Guix Reference Manual}).  This can be done by adding a
crontab entry on the head node:

@example
root@@master# crontab -e
@end example

@noindent
... with something like this:

@example
# Every day at 5AM, run the garbage collector to make sure
# at least 10 GB are free on /gnu/store.
0 5 * * * /usr/local/bin/guix gc -F10G
@end example

We're done with the head node!  Let's look at compute nodes now.

@node Setting Up Compute Nodes
@section Setting Up Compute Nodes

First of all, we need compute nodes to mount those NFS directories that
the head node exports.  This can be done by adding the following lines
to @uref{https://linux.die.net/man/5/fstab,@file{/etc/fstab}}:

@example
@var{head-node}:/gnu/store    /gnu/store    nfs  defaults,_netdev,vers=3 0 0
@var{head-node}:/var/guix     /var/guix     nfs  defaults,_netdev,vers=3 0 0
@var{head-node}:/var/log/guix /var/log/guix nfs  defaults,_netdev,vers=3 0 0
@end example

@noindent
... where @var{head-node} is the name or IP address of your head node.
From there on, assuming the mount points exist, you should be able to
mount each of these on the compute nodes.
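
If the mount points do not exist yet, something along these lines should
do the trick; this is a sketch to adapt to your configuration management
tool of choice:

@example
# Create the mount points and mount everything listed in /etc/fstab.
mkdir -p /gnu/store /var/guix /var/log/guix
mount -a
@end example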

Next, we need to provide a default @command{guix} command that users can
run when they first connect to the cluster (eventually they will invoke
@command{guix pull}, which will provide them with their ``own''
@command{guix} command).  Similar to what the binary installation script
did on the head node, we'll store that in @file{/usr/local/bin}:

@example
mkdir -p /usr/local/bin
ln -s /var/guix/profiles/per-user/root/current-guix/bin/guix \
      /usr/local/bin/guix
@end example

We then need to tell @code{guix} to talk to the daemon running on our
head node, by adding these lines to @file{/etc/profile}:

@example
GUIX_DAEMON_SOCKET="guix://@var{head-node}"
export GUIX_DAEMON_SOCKET
@end example

To avoid warnings and make sure @code{guix} uses the right locale, we
need to tell it to use locale data provided by Guix (@pxref{Application
Setup,,, guix, GNU Guix Reference Manual}):

@example
GUIX_LOCPATH=/var/guix/profiles/per-user/root/guix-profile/lib/locale
export GUIX_LOCPATH

# Here we must use a valid locale name.  Try "ls $GUIX_LOCPATH/*"
# to see what names can be used.
LC_ALL=fr_FR.utf8
export LC_ALL
@end example

For convenience, @code{guix package} automatically generates
@file{~/.guix-profile/etc/profile}, which defines all the environment
variables necessary to use the packages---@code{PATH},
@code{C_INCLUDE_PATH}, @code{PYTHONPATH}, etc.  Thus it's a good idea to
source it from @file{/etc/profile}:

@example
GUIX_PROFILE="$HOME/.guix-profile"
if [ -f "$GUIX_PROFILE/etc/profile" ]; then
  . "$GUIX_PROFILE/etc/profile"
fi
@end example

Last but not least, Guix provides command-line completion notably for
Bash and zsh.  In @file{/etc/bashrc}, consider adding this line:

@verbatim
. /var/guix/profiles/per-user/root/current-guix/etc/bash_completion.d/guix
@end verbatim

Voilà!

You can check that everything's in place by logging in on a compute node
and running:

@example
guix install hello
@end example

The daemon on the head node should download pre-built binaries on your
behalf and unpack them in @file{/gnu/store}, and @command{guix install}
should create @file{~/.guix-profile} containing the
@file{~/.guix-profile/bin/hello} command.
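
Assuming @file{/etc/profile} sources the user's profile as shown above,
a fresh login on the compute node should then find @command{hello} right
away:

@example
$ hello
Hello, world!
@end example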

@node Cluster Network Access
@section Network Access

Guix requires network access to download source code and pre-built
binaries.  The good news is that only the head node needs that since
compute nodes simply delegate to it.

It is customary for cluster nodes to have access at best to a
@emph{white list} of hosts.  Our head node needs at least
@code{ci.guix.gnu.org} in this white list since this is where it gets
pre-built binaries from by default, for all the packages that are in
Guix proper.

Incidentally, @code{ci.guix.gnu.org} also serves as a
@emph{content-addressed mirror} of the source code of those packages.
Consequently, it is sufficient to have @emph{only}
@code{ci.guix.gnu.org} in that white list.

Software packages maintained in a separate repository such as one of the
various @uref{https://hpc.guix.info/channels, HPC channels} are of
course unavailable from @code{ci.guix.gnu.org}.  For these packages, you
may want to extend the white list such that source and pre-built
binaries (assuming third-party servers provide binaries for these
packages) can be downloaded.  As a last resort, users can always
download source on their workstation and add it to the cluster's
@file{/gnu/store}, like this:

@verbatim
GUIX_DAEMON_SOCKET=ssh://compute-node.example.org \
  guix download http://starpu.gforge.inria.fr/files/starpu-1.2.3/starpu-1.2.3.tar.gz
@end verbatim

The above command downloads @code{starpu-1.2.3.tar.gz} @emph{and} sends
it to the cluster's @code{guix-daemon} instance over SSH.

Air-gapped clusters require more work.  At the moment, our suggestion
would be to download all the necessary source code on a workstation
running Guix.  For instance, using the @option{--sources} option of
@command{guix build} (@pxref{Invoking guix build,,, guix, GNU Guix
Reference Manual}), the example below downloads all the source code the
@code{openmpi} package depends on:

@example
$ guix build --sources=transitive openmpi
@dots{}

/gnu/store/xc17sm60fb8nxadc4qy0c7rqph499z8s-openmpi-1.10.7.tar.bz2
/gnu/store/s67jx92lpipy2nfj5cz818xv430n4b7w-gcc-5.4.0.tar.xz
/gnu/store/npw9qh8a46lrxiwh9xwk0wpi3jlzmjnh-gmp-6.0.0a.tar.xz
/gnu/store/hcz0f4wkdbsvsdky3c0vdvcawhdkyldb-mpfr-3.1.5.tar.xz
/gnu/store/y9akh452n3p4w2v631nj0injx7y0d68x-mpc-1.0.3.tar.gz
/gnu/store/6g5c35q8avfnzs3v14dzl54cmrvddjm2-glibc-2.25.tar.xz
/gnu/store/p9k48dk3dvvk7gads7fk30xc2pxsd66z-hwloc-1.11.8.tar.bz2
/gnu/store/cry9lqidwfrfmgl0x389cs3syr15p13q-gcc-5.4.0.tar.xz
/gnu/store/7ak0v3rzpqm2c5q1mp3v7cj0rxz0qakf-libfabric-1.4.1.tar.bz2
/gnu/store/vh8syjrsilnbfcf582qhmvpg1v3rampf-rdma-core-14.tar.gz
@dots{}
@end example

(In case you're wondering, that's more than 320@ MiB of
@emph{compressed} source code.)

We can then make a big archive containing all of this (@pxref{Invoking
guix archive,,, guix, GNU Guix Reference Manual}):

@verbatim
$ guix archive --export \
    `guix build --sources=transitive openmpi` \
    > openmpi-source-code.nar
@end verbatim

@dots{} and we can eventually transfer that archive to the cluster on
removable storage and unpack it there:

@verbatim
$ guix archive --import < openmpi-source-code.nar
@end verbatim
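
From there, and assuming the cluster runs the same Guix revision as the
workstation where the sources were downloaded, packages can be built
without further network access:

@example
# All the transitive sources are now in /gnu/store, so the
# daemon has nothing left to download.
guix build --no-substitutes openmpi
@end example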

This process has to be repeated every time new source code needs to be
brought to the cluster.

As we write this, the research institutes involved in Guix-HPC do not
have air-gapped clusters though.  If you have experience with such
setups, we would like to hear feedback and suggestions.

@node Cluster Disk Usage
@section Disk Usage

@cindex disk usage, on a cluster
A common concern of sysadmins is whether this is all going to eat a lot
of disk space.  If anything, if something is going to exhaust disk
space, it's going to be scientific data sets rather than compiled
software---that's our experience with almost ten years of Guix usage on
HPC clusters.  Nevertheless, it's worth taking a look at how Guix
contributes to disk usage.

First, having several versions or variants of a given package in
@file{/gnu/store} does not necessarily cost much, because
@command{guix-daemon} implements deduplication of identical files, and
package variants are likely to have a number of common files.
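
To see how much a package and its dependencies nominally occupy in
@file{/gnu/store}, you can inspect the size of its closure
(@pxref{Invoking guix size,,, guix, GNU Guix Reference Manual}):

@example
# Report the size of openmpi and of everything it refers to.
guix size openmpi
@end example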

As mentioned above, we recommend having a cron job to run @command{guix
gc} periodically, which removes @emph{unused} software from
@file{/gnu/store}.  However, there's always a possibility that users
will keep lots of software in their profiles, or lots of old generations
of their profiles, which is ``live'' and cannot be deleted from the
viewpoint of @command{guix gc}.

The solution to this is for users to regularly remove old generations of
their profile.  For instance, the following command removes generations
that are more than two months old:

@example
guix package --delete-generations=2m
@end example
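
Before deleting anything, users can review what their generations
contain with the @option{--list-generations} option:

@example
# Show each generation with its creation date and its packages.
guix package --list-generations
@end example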

Likewise, it's a good idea to invite users to regularly upgrade their
profile, which can reduce the number of variants of a given piece of
software stored in @file{/gnu/store}:

@example
guix pull
guix upgrade
@end example

As a last resort, it is always possible for sysadmins to do some of this
on behalf of their users.  Nevertheless, one of the strengths of Guix is
the freedom and control users get over their software environment, so we
strongly recommend leaving users in control.

@node Cluster Security Considerations
@section Security Considerations

@cindex security, on a cluster
On an HPC cluster, Guix is typically used to manage scientific software.
Security-critical software such as the operating system kernel and
system services such as @code{sshd} and the batch scheduler remain under
the control of sysadmins.

The Guix project has a good track record of delivering security updates
in a timely fashion (@pxref{Security Updates,,, guix, GNU Guix Reference
Manual}).  To get security updates, users have to run @code{guix pull &&
guix upgrade}.

Because Guix uniquely identifies software variants, it is easy to see
whether a vulnerable piece of software is in use.  For instance, to
check whether the glibc@ 2.25 variant without the mitigation patch
against
``@uref{https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt,Stack
Clash}'' is in use, one can check whether user profiles refer to it at
all:

@example
guix gc --referrers /gnu/store/…-glibc-2.25
@end example

This will report whether profiles exist that refer to this specific
glibc variant.
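
Since the command needs the exact store file name, one way to find it is
with a shell glob; this is just an illustration, assuming such a glibc
variant is present in the store:

@example
# Find the complete store file names of glibc 2.25 variants.
ls -d /gnu/store/*-glibc-2.25
@end example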

@c *********************************************************************
@node Acknowledgments
@chapter Acknowledgments

@@ -3656,8 +4049,10 @@ information on these fine people.  The @file{THANKS} file lists people
who have helped by reporting bugs, taking care of the infrastructure,
providing artwork and themes, making suggestions, and more---thank you!

This document includes adapted sections from articles that have
previously been published on the Guix blog at
@uref{https://guix.gnu.org/blog} and on the Guix-HPC blog at
@uref{https://hpc.guix.info/blog}.


@c *********************************************************************