arch-zbm
Helper script to install Arch Linux with ZFSBootMenu from within a running Arch Linux live CD ISO image
Prep
We expect minimal prep on your end. Please make sure that before execution the following conditions are met.
UEFI
On a UEFI system ensure these conditions are met. See How to prep for details on how to meet these conditions.
- One GPT-partitioned disk
- Arch Linux live CD ISO image sees exactly one partition with partition type code BF00 ("Solaris root")
- Arch Linux live CD ISO image sees exactly one partition with partition type code EF00 ("EFI System Partition")
- The EF00 EFI partition is mountable, in practical terms this usually only means it has a file system.
- No ZFS zpool exists
Legacy BIOS
If you are instead running a legacy BIOS machine ensure these conditions are met. See How to prep for details on how to meet these conditions.
- One MBR-partitioned disk
- Arch Linux live CD ISO image sees exactly one partition with partition type code bf ("Solaris root")
- Arch Linux live CD ISO image sees exactly one partition with partition type code 83 ("Linux")
- The 83 Linux partition is mountable, in practical terms this usually only means it has a file system.
- No ZFS zpool exists
Neither on a UEFI nor on a legacy BIOS system are any of these conditions a requirement of ZFSBootMenu. We're just setting requirements to easily identify if you intend to do a UEFI or a legacy BIOS install. Consequently the script has no logic to detect UEFI or legacy BIOS mode, that's legwork left to the reader :) The Internet seems to agree that a good quick check is to see if your Arch Linux live CD ISO image has directory /sys/firmware/efi.
[ -d /sys/firmware/efi ] && echo 'Likely a UEFI system' || echo 'Probably a legacy BIOS system'
If you're unsure nothing's stopping you from just giving it a go with a best guess and if that fails you know you guessed wrong.
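The same quick check can be wrapped in a small function if you want to reuse it in your own prep notes. This is only a sketch: detect_firmware_mode is a hypothetical helper and not part of setup.sh, and the optional path argument exists purely so the check can be tried against other directories.

```shell
# Hypothetical helper, not part of setup.sh.
# Prints "uefi" when the given directory exists, "bios" otherwise.
# The directory defaults to /sys/firmware/efi as in the one-liner above.
detect_firmware_mode() (
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "${efi_dir}" ]; then
        printf '%s\n' 'uefi'
    else
        printf '%s\n' 'bios'
    fi
)

detect_firmware_mode
```

Just like the one-liner this is a heuristic, so treat the answer as a best guess.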
How to prep
UEFI
On a blank example disk /dev/sda you can fulfill the UEFI requirements (one EF00 partition with a file system plus one BF00 partition) for example like so:
sgdisk --new '1::+512M' --new '2' --typecode '1:EF00' --typecode '2:BF00' /dev/sda
mkfs.vfat /dev/sda1
--new '1::+512M': Create partition number 1. The field separator : separates the partition number from start sector. In this case start sector is unspecified so start sector sits at whatever the system's default is for this operation. On a blank disk on an Arch Linux live CD ISO image this will default to sector 2048. Partition ends at whatever the beginning is +512M meaning plus 512 Mebibytes.
--new '2': Create partition number 2. Both field number 2, the start sector, and field number 3, the end sector, are unspecified, there's no field separator :. Field number 2 will be the first free sector - in this case right after partition 1 - and field number 3 will be end of disk. Thus partition 2 will fill the remaining free disk space.
--typecode '1:EF00': Partition 1 gets partition type code EF00, an EFI System Partition.
--typecode '2:BF00': Partition 2 gets partition type code BF00, a Solaris root partition.
The result will be something like this at which point you can start the setup.sh script, see How to run this? below for more details.
# lsblk --paths --output 'NAME,SIZE,FSTYPE,PARTTYPE,PARTTYPENAME,PTTYPE' /dev/sda
NAME SIZE FSTYPE PARTTYPE PARTTYPENAME PTTYPE
/dev/sda 10G gpt
├─/dev/sda1 512M vfat c12a7328-f81f-11d2-ba4b-00a0c93ec93b EFI System gpt
└─/dev/sda2 9.5G 6a85cf4d-1dd2-11b2-99a6-080020736631 Solaris root gpt
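If you'd like to double-check the "exactly one partition" conditions on a GPT disk before running the script, a sketch like the following may help. count_parttype is a hypothetical helper that is not part of setup.sh. It expects one partition type GUID per line on stdin, for example from lsblk --noheadings --output PARTTYPE /dev/sda, and the GUIDs are the ones shown in the lsblk output above.

```shell
# Hypothetical helper, not part of setup.sh.
# Reads one partition type GUID per line on stdin and prints how many
# lines match the GUID given as first argument (case-insensitive).
count_parttype() {
    awk -v want="$1" 'tolower($1) == tolower(want) { n++ } END { print n + 0 }'
}

# Sample input mimicking the example disk above. On a real system you
# would pipe in: lsblk --noheadings --output PARTTYPE /dev/sda
printf '%s\n' \
    '' \
    'c12a7328-f81f-11d2-ba4b-00a0c93ec93b' \
    '6a85cf4d-1dd2-11b2-99a6-080020736631' \
    | count_parttype '6a85cf4d-1dd2-11b2-99a6-080020736631'
```

A result of 1 for both the EFI System GUID and the Solaris root GUID means the two partition conditions hold. The file system and zpool conditions are not covered by this sketch.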
Legacy BIOS
For a legacy BIOS machine you'll be using a Master Boot Record (MBR) on your disk.
printf -- '%s\n' 'label: dos' 'start=1MiB, size=512MiB, type=83, bootable' 'start=513MiB, size=+, type=bf' | sfdisk /dev/sda
mkfs.vfat /dev/sda1
label: dos: Create the following partition layout in a Master Boot Record.
start=1MiB, size=512MiB, type=83, bootable: Partition 1 begins 1 Mebibyte after disk start and is 512 Mebibytes in size. We're setting its bootable flag and setting partition type code 83 ("Linux").
start=513MiB, size=+, type=bf: Partition 2 begins right at the start of Mebibyte 513, this is the very next sector after the end of partition 1. It takes up the remaining disk space, we're assigning type code bf ("Solaris").
The result will be something like this at which point you can start the setup.sh script, see How to run this? below for more details.
# lsblk --paths --output 'NAME,SIZE,FSTYPE,PARTTYPE,PARTTYPENAME,PTTYPE' /dev/sda
NAME SIZE FSTYPE PARTTYPE PARTTYPENAME PTTYPE
/dev/sda 10G dos
├─/dev/sda1 512M vfat 0x83 Linux dos
└─/dev/sda2 9.5G 0xbf Solaris dos
Partition naming
Since this script works with UEFI and legacy BIOS mode we'll be addressing both disk layout schemes with umbrella terms for the rest of this document for better readability: "The zpool partition" will be the GPT BF00 partition and the MBR bf partition. You'll parse the text accordingly. "The boot partition" will be the GPT EF00 partition as well as the MBR 83 partition.
ZFS dataset layout
The script will create a single ZFS zpool zpool on the zpool partition with dataset child zpool/root which itself has one child zpool/root/archlinux, that's where Arch Linux gets installed. Parallel to zpool/root it'll create zpool/data with a zpool/data/home child dataset that gets mounted at /home.
How to run this?
- Boot an Arch Linux live CD ISO image
- Run:
    export SCRIPT_URL='https://quico.space/quico-os-setup/arch-zbm/raw/branch/main/setup.sh' && curl -s "${SCRIPT_URL}" | bash
  During execution the script will call itself when it changes into its chroot, that's why we export SCRIPT_URL. Feel free to update "${SCRIPT_URL}" with whatever branch or revision you want to use from quico.space/quico-os-setup/arch-zbm. Typically .../branch/main/setup.sh as shown above is what you want.
Options
The following options can be given either by exporting them as shell variables prior to script execution or in a file named archzbm_settings.env that lives in your current working directory where you're about to execute the script. You can walk yourself through an interactive questionnaire that helps create a valid archzbm_settings.env file. Check out Command line setup help for details on the questionnaire.
If you instead want to define settings yourself with an archzbm_settings.env file its file format is identical to shell variable assignments of the form VAR=value or VAR='value'.
If ./archzbm_settings.env exists the script will source its content and export all variables for use in future steps.
In cases where a variable is both exported prior to script execution and specified in archzbm_settings.env the latter will override the former.
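To illustrate the source-and-export behavior and why the settings file wins over a previously exported variable, here is a minimal sketch. load_settings is a hypothetical function name and the use of set -a is an assumption, the actual setup.sh may implement this differently.

```shell
# Hypothetical sketch, not the actual setup.sh code.
# Sources a settings file and exports every variable it assigns.
load_settings() {
    if [ -f "$1" ]; then
        set -a      # assumption: auto-export all variables assigned from here on
        . "$1"      # plain VAR=value lines overwrite earlier values
        set +a
    fi
}

# Does nothing when the file does not exist.
load_settings './archzbm_settings.env'
```

Because the file is sourced after your shell environment is already in place, an assignment in archzbm_settings.env simply overwrites whatever value you exported beforehand.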
Known options are as follows.
Kernel downgrade
By default we install newest linux and linux-headers packages into a chroot. Once we're in that chroot we then install newest AUR zfs-dkms package. You may want to override linux and linux-headers versions to ensure you end up with a compatible mix between them and zfs-dkms.
For example:
export ARCHZBM_KERNEL_VER=6.5.9.arch2
In our chroot this will trigger execution of:
downgrade --ala-only 'linux=6.5.9.arch2' 'linux-headers=6.5.9.arch2' --ignore always
Where downgrade is the AUR downgrade package. This will downgrade linux and linux-headers and will add a setting to your /etc/pacman.conf:
[options]
IgnorePkg = linux linux-headers
Setting ARCHZBM_KERNEL_VER to an empty string '' or keeping it undefined are both valid and will retain newest versions instead of downgrading.
Also read Kernel selection for details.
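As a rough illustration of how the variable maps to the downgrade call, the following sketch prints the command for a given version and prints nothing for an empty or undefined one. print_downgrade_cmd is a hypothetical helper, the real script's internals may look different.

```shell
# Hypothetical helper, not part of setup.sh.
# Prints the downgrade command for the version given as first argument.
# Prints nothing when the version is empty, mirroring "keep newest".
print_downgrade_cmd() {
    if [ -n "${1:-}" ]; then
        printf "downgrade --ala-only 'linux=%s' 'linux-headers=%s' --ignore always\n" "$1" "$1"
    fi
}

print_downgrade_cmd "${ARCHZBM_KERNEL_VER:-}"
```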
Compression
By default we create a zpool with ZFS property compression=on. If the lz4_compress pool feature is active this will by default enable compression=lz4. See man 7 zfsprops for example in ZFS 2.1.9 for details. See zpool get feature@lz4_compress <pool> to check this feature's status on your <pool>.
To get a zpool with uncompressed datasets export the shell variable ARCHZBM_ZFSPROPS_NO_COMPRESSION with any value prior to running this script. Literally any value works as long as you're not setting this to an empty string:
export ARCHZBM_ZFSPROPS_NO_COMPRESSION=yesplease
Encryption
By default we encrypt the zpool with ZFS property encryption=on. In ZFS 2.1.9 this defaults to encryption=aes-256-gcm.
To get a zpool with unencrypted datasets export the shell variable ARCHZBM_ZFSPROPS_NO_ENCRYPTION with any value prior to running this script:
export ARCHZBM_ZFSPROPS_NO_ENCRYPTION=yup
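The "any non-empty value" semantics of both opt-out variables can be sketched as follows. zfs_feature_opts is a hypothetical helper, and two things are assumptions: that the properties end up as -O options and that the encryption switch treats an empty string the same way the compression switch does.

```shell
# Hypothetical helper, not part of setup.sh.
# Prints the property options that stay enabled, depending on whether
# the opt-out variables hold a non-empty value.
zfs_feature_opts() (
    opts=''
    if [ -z "${ARCHZBM_ZFSPROPS_NO_COMPRESSION:-}" ]; then
        opts="${opts} -O compression=on"
    fi
    if [ -z "${ARCHZBM_ZFSPROPS_NO_ENCRYPTION:-}" ]; then
        opts="${opts} -O encryption=on"
    fi
    # Strip the leading space left over from concatenation
    printf '%s\n' "${opts# }"
)

zfs_feature_opts
```

An unset variable and a variable set to '' both count as "not opted out" in this sketch.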
Passwords
By default both the zpool password and the account password for root are literally password. While you can certainly change these after initial system setup (see Password change) you can also optionally set passwords before script execution as follows:
ARCHZBM_ZPOOL_PASSWORD='a fancy password'
ARCHZBM_ROOT_PASSWORD='t0psecr3t!'
While the root password is allowed to be weak and chpasswd won't care, do make sure to set a zpool password that meets ZFS' complexity rules. Per man 7 zfsprops section keyformat the only requirement is a length "between 8 and 512 bytes" (as in minimum 8 characters). If you pick a password that's too weak ZFS will reject zpool creation and very ungracefully derail the rest of this script. The script doesn't check what you're setting.
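Since the script doesn't validate the zpool password you may want to check its length yourself before starting. A minimal sketch, where zpool_password_ok is a hypothetical helper and not part of setup.sh:

```shell
# Hypothetical helper, not part of setup.sh.
# Succeeds when the passphrase given as first argument is between
# 8 and 512 bytes long, the range quoted from man 7 zfsprops above.
zpool_password_ok() (
    # wc -c counts bytes; the arithmetic expansion strips padding
    len=$(( $(printf '%s' "$1" | wc -c) ))
    [ "${len}" -ge 8 ] && [ "${len}" -le 512 ]
)

if zpool_password_ok "${ARCHZBM_ZPOOL_PASSWORD:-password}"; then
    printf '%s\n' 'zpool password length looks fine'
else
    printf '%s\n' 'zpool password must be 8 to 512 bytes long' >&2
fi
```

Note that this counts bytes, not characters, so a passphrase with multi-byte characters reaches the minimum sooner than its character count suggests.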
The script does create a second user named build but doesn't set a password on account creation. It's intended as a helper for system setup tasks such as sudo -u build paru -S <package> where an account password is irrelevant since root can always sudo whatever it wants. You will not be able to log in to the build account yourself although you certainly could set a password for it. Instead we suggest you create a proper user account for yourself. Your newly installed Arch Linux comes with an /etc/motd greeting that summarizes this as:
useradd --create-home --shell /bin/bash --user-group --groups wheel <user>
passwd <user>
Networking
By default the script configures plain ZFSBootMenu without networking or an SSH server. If you're interested in SSH-ing into your ZFSBootMenu boot loader you're going to want to specify some of the following variables.
IP address
IPv6 addresses are untested. Script has been confirmed working with IPv4 addresses.
ARCHZBM_NET_CLIENT_IP=''
ARCHZBM_NET_SERVER_IP=''
ARCHZBM_NET_GATEWAY_IP=''
ARCHZBM_NET_NETMASK=''
ARCHZBM_NET_HOSTNAME=''
ARCHZBM_NET_DEVICE=''
ARCHZBM_NET_AUTOCONF=''
By default none of the variables are set to any value and no networking will be available in ZFSBootMenu. If you want networking as in an IP address bound to a network interface set at least one of these variables or one of the SSH variables listed further down. Setting one or more ARCHZBM_NET_* variables to an empty string is valid. If at least one variable is given either from this paragraph or from SSH we're assuming that you want networking. Unspecified values and values set to the empty string '' use defaults.
For networking we rely on the mkinitcpio-nfs-utils package with its net hook. Please refer to its initcpio-install-net script file for usage hints on above variables. The hook implements a subset of the ip Kernel Command Line argument.
Mapping between net hook field names and our shell variables is straightforward. Fields 8, 9 and 10 (DNS and NTP server addresses) from the official ip docs are unsupported in net hook. As such our hook has a total of 7 fields available for you to configure.
+-------------+------------------------+
| net hook | This script |
+-------------+------------------------+
| <client-ip> | ARCHZBM_NET_CLIENT_IP |
| <server-ip> | ARCHZBM_NET_SERVER_IP |
| <gw-ip> | ARCHZBM_NET_GATEWAY_IP |
| <netmask> | ARCHZBM_NET_NETMASK |
| <hostname> | ARCHZBM_NET_HOSTNAME |
| <device> | ARCHZBM_NET_DEVICE |
| <autoconf> | ARCHZBM_NET_AUTOCONF |
+-------------+------------------------+
A valid example with a few fields populated may look like so:
ARCHZBM_NET_CLIENT_IP='10.10.10.2'
ARCHZBM_NET_GATEWAY_IP='10.10.10.1'
ARCHZBM_NET_NETMASK='255.255.255.0'
ARCHZBM_NET_DEVICE='eth0'
ARCHZBM_NET_AUTOCONF='none'
Note that in this example ARCHZBM_NET_SERVER_IP and ARCHZBM_NET_HOSTNAME are left unassigned.
It'll add the following ip= instruction to your Kernel Command Line:
ip=10.10.10.2::10.10.10.1:255.255.255.0::eth0:none
This is also valid and will configure eth0 via DHCP:
ARCHZBM_NET_DEVICE='eth0'
ARCHZBM_NET_AUTOCONF='dhcp'
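The mapping from the seven variables to the colon-separated ip= fields can be sketched like this. build_ip_arg is a hypothetical helper, and leaving unset fields empty is an assumption made for illustration, the script itself may fill in defaults differently.

```shell
# Hypothetical helper, not part of setup.sh.
# Joins the seven ARCHZBM_NET_* variables in net hook field order.
# Unset or empty variables simply become empty fields.
build_ip_arg() {
    printf 'ip=%s:%s:%s:%s:%s:%s:%s\n' \
        "${ARCHZBM_NET_CLIENT_IP:-}" \
        "${ARCHZBM_NET_SERVER_IP:-}" \
        "${ARCHZBM_NET_GATEWAY_IP:-}" \
        "${ARCHZBM_NET_NETMASK:-}" \
        "${ARCHZBM_NET_HOSTNAME:-}" \
        "${ARCHZBM_NET_DEVICE:-}" \
        "${ARCHZBM_NET_AUTOCONF:-}"
}

build_ip_arg
```

With the static example values from above exported this yields the same ip=10.10.10.2::10.10.10.1:255.255.255.0::eth0:none string, where the two empty fields are the unassigned server IP and hostname.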
In ZFSBootMenu the device names that go into ARCHZBM_NET_DEVICE are raw unchanged kernel device names such as eth0. If you're unsure which device name to use in your Arch Linux live CD ISO image check dmesg output. During boot typically a kernel module will first assign the raw kernel device name then later systemd will enforce Predictable Network Interface Names.
In dmesg | grep on a physical PC with an MSI B550-A Pro mainboard from 2020 that comes with one onboard Realtek RTL8111H network adapter governed by the Realtek RTL-8169 Gigabit Ethernet driver from the r8169 kernel module you will for example see:
# dmesg -T | grep eth
[time] r8169 0000:2a:00.0 eth0: RTL8168h/8111h, 04:7c:16:00:01:02, XID 541, IRQ 95
[time] r8169 0000:2a:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[time] r8169 0000:2a:00.0 enp42s0: renamed from eth0
Notice how a Predictable Network Interface Name comes in on line 3. What you need here is the eth0 part.
SSH
If you want networking, indicated by at least one of the ARCHZBM_NET_* or ARCHZBM_SSH_* variables being set, we assume that you want an SSH daemon as well. This comes in the form of a dropbear daemon with minimal configurability. Use the following variables to define Dropbear's behavior.
ARCHZBM_SSH_PORT='22'
ARCHZBM_SSH_KEEPALIVE_INTVL='1'
ARCHZBM_SSH_AUTH_KEYS=''
In ARCHZBM_SSH_PORT you specify Dropbear's listening port, this defaults to 22 if unconfigured or set to an empty string. With ARCHZBM_SSH_KEEPALIVE_INTVL you define at which interval Dropbear will send keepalive messages to an SSH client through the SSH connection. This defaults to 1 as in every 1 second a keepalive message is sent. Per man 8 dropbear a value of 0 disables Dropbear sending keepalive messages. We suggest leaving this on and keeping the interval short, see SSH in ZFSBootMenu for how to work with this.
Dropbear in this setup only supports key-based authentication, no password-based authentication. The value from ARCHZBM_SSH_AUTH_KEYS will be converted to a list of public SSH keys allowed to SSH into Dropbear as its default root user while ZFSBootMenu is running. The format of ARCHZBM_SSH_AUTH_KEYS is a single line where authorized_keys entries are split with double-commas:
ssh-rsa Eahajei8,,ssh-ed25519 kaeD0mas ...
This syntax crutch allows you to use the full range of Dropbear-supported authorized_keys stanzas, see man 8 dropbear for what's available. Whether or not this is useful to you is another topic :) At least the functionality for stanzas is there by separating values in ARCHZBM_SSH_AUTH_KEYS with double-commas.
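To see why a double-comma separator is handy, consider that authorized_keys options within one entry are themselves separated by single commas. The following sketch splits a value into one entry per line. split_auth_keys is a hypothetical helper that is not necessarily how setup.sh does it, and the key material is the same placeholder text as above.

```shell
# Hypothetical helper, not part of setup.sh.
# Splits a double-comma separated value into one authorized_keys
# entry per line. Single commas inside an entry are left untouched.
split_auth_keys() {
    printf '%s\n' "$1" | awk '{
        n = split($0, entries, ",,")
        for (i = 1; i <= n; i++) {
            if (entries[i] != "") {
                print entries[i]
            }
        }
    }'
}

split_auth_keys 'ssh-rsa Eahajei8,,no-pty,no-port-forwarding ssh-ed25519 kaeD0mas'
```

The second entry keeps its no-pty,no-port-forwarding option list intact because only double-commas act as separators.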
Command line setup help
An interactive questionnaire can guide you through settings.
To do the questionnaire yourself start this script with the setup argument:
export SCRIPT_URL='https://quico.space/quico-os-setup/arch-zbm/raw/branch/main/setup.sh' && curl -s "${SCRIPT_URL}" | bash -s -- setup
When done rerun it without that argument:
export SCRIPT_URL='https://quico.space/quico-os-setup/arch-zbm/raw/branch/main/setup.sh' && curl -s "${SCRIPT_URL}" | bash
Steps
The script takes the following installation steps.
- Install ZFS tools and kernel module with github.com/eoli3n/archiso-zfs
- Create one ZFS zpool on top of zpool partition, encrypted and compressed datasets, password password
  - See paragraph Passwords to predefine your own passwords in a settings file
  - See paragraphs Compression/Encryption to optionally disable properties
- Create dataset for Arch Linux and /home
- Install Arch Linux into pool
- Add ZFSBootMenu to boot partition
- Configure boot method
  - Either an EFI image with EFI boot order entries on a UEFI machine
  - Or Syslinux with extlinux for a legacy BIOS computer
- If requested by user enable SSH in ZFSBootMenu. We then also add:
- Add pacman hooks to keep ZFSBootMenu images (and extlinux) updated
- Exit into Arch Linux live CD ISO image shell for you to reboot and frolic
SSH in ZFSBootMenu
Per SSH and Networking this script will optionally add a Dropbear SSH daemon to ZFSBootMenu. While the mechanism of SSH-ing into a server isn't particularly noteworthy we humbly suggest that in this particular use case you let your SSH client listen for keepalive messages from the server.
ssh -o ServerAliveInterval=3 -o ServerAliveCountMax=0 root@<addr> -p <port>
A typical workflow with Dropbear is for you to SSH into it, issue zfs or zfsbootmenu commands and allow the Arch Linux boot process to commence. As soon as you're done Dropbear will terminate as ZFSBootMenu hands control off to your operating system's kernel. Without your client listening to keepalive messages it may not realize that the connection's gone for quite some time until you harshly interrupt it.
The server defaults to sending keepalive messages to your client every second.
With -o ServerAliveInterval=3 you instruct your client to send an are-you-still-there message to the server if your client ever stops getting keepalive messages from the server for 3 seconds. The server defaults to sending 1 keepalive ping per second so even on a somewhat lossy connection we can reasonably expect to get one message through to us within 3 seconds.
When it comes to the point that your SSH client sends an are-you-still-there message it expects a near-realtime response. It will accept -o ServerAliveCountMax=0 failures from the server to comply.
This effectively configures your SSH client to remain connected even through somewhat lossy hops to the Dropbear daemon; and to cleanly disconnect 3 seconds and some change after you've executed whatever you needed to do in ZFSBootMenu.
Kernel selection
This script compiles ZFS via Arch Linux' Dynamic Kernel Module Support (DKMS). Not all kernels allow for successful compilation, in some instances a particularly recent kernel version may change APIs to such a degree that ZFS compilation simply fails.
We strongly suggest that you:
- Firstly, refer to a resource such as the Arch Linux Archive package version list to find out which newest kernel version this script will install.
- Secondly, research if the newest AUR zfs-dkms package is compatible with that kernel. Two reasonable points of contact are AUR and its comments section for zfs-dkms where users quickly report issues; and the github.com/openzfs/zfs issues list.
An example for this is that linux-6.6.1.arch1-1-x86_64 came out on Wednesday, November 8, 2023 at a time when newest zfs-dkms package version was 2.2.0 which did not compile against linux 6.6.x.
You'd then set for example:
export ARCHZBM_KERNEL_VER=6.5.9.arch2
Where any 6.5.x version is known to work well with zfs-dkms. See also Kernel downgrade for details on how to configure this.
Flavor choices
We make the following opinionated flavor choices. Feel free to change them to your liking.
- Arch Linux locale is set to en_US.UTF-8
- Keymap is de-latin1
  - Consult /etc/vconsole.conf
  - Change zfs set org.zfsbootmenu:commandline=...
- No X.Org Server, Wayland compositors or other GUI elements get installed
- Timezone is Etc/UTC
  - Check timedatectl set-timezone <tzdata-zone>
Post-run manual steps
After installation you're going to want to at least touch these points in your new Arch Linux install:
- Package manager hook: pacman does not have a hook to do ZFS snapshots
  - See quico.space/quico-os-setup/zfs-pacman-hook for an example you may want to install
- Hostname: Installation chose a pseudo-randomly generated 8-character string with pwgen
  - Check hostnamectl set-hostname <hostname>
- Unprivileged user accounts: The OS was installed with root and unprivileged build users
- Unless you had a settings file or exported shell env vars per Passwords you're going to want to change passwords now:
  - ZFS: The password for all datasets underneath zpool is password.
  - Local root account: The local root account's password is password.
- Arch User Repository (AUR) helper: We installed paru as our AUR helper, we installed from AUR as paru-bin.
- In /etc/systemd/network/50-wired.network instead of a DHCP-based network config you can get a static one. The DHCP-based one for reference looks like:
    ...
    [Network]
    DHCP=ipv4
    IPForward=yes
    Domains=~.

    [DHCP]
    UseDNS=yes
    RouteMetric=10
  A static config does away with the [DHCP] section:
    ...
    [Network]
    Address=10.10.10.2/24
    Gateway=10.10.10.1
    DNS=10.10.10.1
    IPForward=yes
    Domains=~.
- In case you later want a graphical interface and specifically NetworkManager (via package networkmanager) consider telling it to keep its hands off of some of your network interfaces. The bullet point above adds a systemd-style config file that systemd-networkd.service will read and use. Should you ever install NetworkManager it will by default assume that it must manage all interfaces. It'll use its own DHCP client to try and get IP addresses for managed interfaces in which case you'll end up with whatever addressing scheme you configured in a .network unit file plus NetworkManager's additional address. Create /etc/NetworkManager/conf.d/99-unmanaged-devices.conf for example to declare some interfaces as off-limits or unmanaged:
    [keyfile]
    unmanaged-devices=mac:52:54:00:74:79:56;type:ethernet
  Check out ArchWiki article "NetworkManager" section "Ignore specific devices" for more info.
Password change
After installation you're going to want to change your ZFS encryption password (unless you preconfigured a good zpool password in a settings file per Passwords). At any rate you still want to be familiar with the process and its caveat in case you ever need a zpool password change or want to do one now.
Steps
In a running OS:
- Change password in keylocation file, e.g. /etc/zfs/zpool.key or whatever other "${zpool_name}"'.key' file you used during setup
- Set this key as the new encryption key:
    zfs change-key -l zpool
  Quoting man 8 zfs-change-key from zfs-utils version 2.1.9 for the -l argument: "Ensures the key is loaded before attempting to change the key." When successful the command will not output data, it'll just silently change your encryption key.
- Rebuild initramfs:
    mkinitcpio -P
  Here for example with -P (--allpresets) which processes all presets contained in /etc/mkinitcpio.d. This step puts the changed key file into your initramfs. During setup we've adjusted /etc/mkinitcpio.conf so that it contains FILES=(/etc/zfs/zpool.key) which causes the file to be added to initramfs as-is.
Boot flow
With your password changed in two locations (key file and initramfs) the boot process works as follows.
At boot time ZFSBootMenu will scan all pools that it can import for a bootfs property. If it only finds one pool with that property the dataset given as bootfs will be selected for boot with a 10-second countdown allowing manual interaction. With bootfs set ZFSBootMenu will not actively search through datasets for valid kernel and initramfs combinations, it'll instead accept bootfs as the default boot entry without us entering the pool decryption passphrase.
Upon loading into a given dataset ZFSBootMenu will attempt to auto-load the matching decryption key. In our setup this will fail because we purposely stored the encryption key inside our zpool/root/archlinux dataset. ZFSBootMenu will prompt us to type in the decryption key.
Lastly ZFSBootMenu loads our OS' kernel and initramfs combination via kexec. For this step we don't need to enter the decryption key again. Our initramfs file contains the plain-text /etc/zfs/zpool.key file which allows it to seamlessly import the right dataset, load its key and mount it.
Caveats in a password change
ZFS differentiates between user keys - also called wrapping keys - and the master key for any given encryption root. You never interact with the master key, you only pick your personal user key. Subsequently a user key change (in our use case we perceive this simply as a password change) has zero effect on data that's already encrypted. The operation is instant and merely reencrypts the already existing master key, the so-called wrapped master key.
ZFS generates the master key exactly once when you enable encryption on a dataset - technically when it becomes an encryption root. Among other inputs it uses your user key to encrypt (to wrap) the master key. When you change your user key it just means that the master key stays exactly the same and only the encrypted (wrapped) key changes.
man 8 zfs-change-key from zfs-utils version 2.1.9 adds:
If the user's key is compromised, zfs change-key does not necessarily protect existing or newly-written data from attack. Newly-written data will continue to be encrypted with the same master key as the existing data. The master key is compromised if an attacker obtains a user key and the corresponding wrapped master key. Currently, zfs change-key does not overwrite the previous wrapped master key on disk, so it is accessible via forensic analysis for an indeterminate length of time.
In the event of a master key compromise, ideally the drives should be securely erased to remove all the old data (which is readable using the compromised master key), a new pool created, and the data copied back. This can be approximated in place by creating new datasets, copying the data (e.g. using zfs send | zfs recv), and then clearing the free space with zpool trim --secure if supported by your hardware, otherwise zpool initialize.
On one hand changing the ZFS encryption password is generally a good and useful thing to do. On the other hand changing your password does not currently overwrite previous wrapped master keys on disk. A sufficiently motivated party that gains access to a wrapped master key and the matching user key is able to decrypt the master key and use it to read all data encrypted with it.
By extension this means after a password change your data remains at risk until you've copied it to a new dataset and erased previously used space thereby erasing any previous wrapped master keys.
Changing master key
In order to generate a new master key after you've changed your user key as mentioned in man 8 zfs-change-key from zfs-utils version 2.1.9 one example workflow goes like this:
- Change user key
  - Update /etc/zfs/zpool.key
  - Update zpool with new key via zfs change-key -l zpool
  - Generate new initramfs with mkinitcpio -P
- Create a snapshot from current system dataset
    # Assuming current system dataset is zpool/root/archlinux-sxu
    # where '-sxu' is a random suffix to differentiate datasets
    # and has no real meaning
    zfs snapshot zpool/root/archlinux-sxu@rekey
- Within same pool send/receive snapshot
    zfs send \
        --large-block \
        --compressed \
        'zpool/root/archlinux-sxu@rekey' | \
    zfs receive \
        -Fvu \
        -o 'encryption=on' \
        -o 'keyformat=passphrase' \
        -o 'keylocation=file:///etc/zfs/zpool.key' \
        -o 'mountpoint=/' \
        -o 'canmount=noauto' \
        -o 'org.zfsbootmenu:commandline=rw nowatchdog rd.vconsole.keymap=de-latin1' \
        'zpool/root/archlinux-frn'
  Explanation:
  - We specifically don't zfs send -R (--replicate). While it would normally be nice to transfer all of a dataset's children at once such as all of its snapshots the -R argument conflicts with the encryption property. See comment by Tom Caputi on GitHub openzfs/zfs issue 10507 from June 2020 for details. Basically if encryption is set then -R doesn't work. We could transfer existing encryption properties with -w/--raw but we don't actually want to transfer encryption properties at all. We want them to change during transfer, see the bullet point four points down from here talking about encryption.
  - We zfs receive -F destroying any target snapshots and file systems beyond the snapshot we're transferring. In this example the target zpool/root/archlinux-frn doesn't even exist so -F isn't necessary to clean anything up. It's just good practice.
  - With -v we get verbose progress output
  - Argument -u makes sure the dataset does not get mounted after transfer. ZFS would mount it into / which wouldn't be helpful since we're currently using that filesystem ourselves.
  - We set encryption properties keyformat, keylocation and most importantly encryption. The latter will turn our transferred dataset into its own encryptionroot which in turn generates a new master key. The auto-generated new master key gets wrapped with our updated passphrase in keylocation. This basically reencrypts all data in this dataset during transfer.
  - We set mountpoint and canmount as well as an org.zfsbootmenu:commandline as we would for any new system dataset.
- Change zpool's bootfs property to new system dataset
    zpool set bootfs=zpool/root/archlinux-frn zpool
- Boot into new system dataset
- After reboot and now that you're in the new system dataset change its encryptionroot by letting it inherit data from its parent:
    zfs change-key -i -l zpool/root/archlinux-frn
  The parent zpool/root is inheriting this property from zpool which will make sure that zpool/root/archlinux-frn essentially gets its key now from zpool. Both zpool/root/archlinux-frn and zpool use the same exact keylocation with identical content. This operation is instant.
Finishing touches
Confirm master key change
Just to confirm that the master key has changed run this command. It takes a moment to output data:
zfs send --raw zpool/root/archlinux-frn@rekey | zstream dump | sed -n -e '/crypt_keydata/,/end crypt/p; /END/q'
Repeat for source dataset zpool/root/archlinux-sxu@rekey. You're particularly interested in parameters DSL_CRYPTO_MASTER_KEY_1 and the initialization vector DSL_CRYPTO_IV. Notice that they differ between old and new dataset confirming that your new dataset has a new master key.
Clean-up
Clean up:
- In newly keyed/reencrypted system dataset destroy its snapshot
zfs destroy zpool/root/archlinux-frn@rekey
- Recursively destroy source dataset
zfs destroy -r zpool/root/archlinux-sxu
Unmap/TRIM
Next up unmap/TRIM unallocated disk areas. If your zpool runs on an entire disk and not just on a partition, and if your disk supports TRIM you're going to want to do:
zpool trim --secure zpool
The next best alternative is to instead do:
zpool initialize zpool
View status with either one of:
# With TRIM status
zpool status -t zpool
# Without TRIM status
zpool status zpool
ZFS setup explained
Overview
The ZFS pool and dataset setup that makes this tick, explained in plain English.
- Create zpool with options:
  - -R /mnt (aka -o cachefile=none -o altroot=/mnt). The pool is never cached, i.e. it's considered temporary. All pool and dataset mount paths have /mnt prepended. From man zpoolprops: "This can be used when examining an unknown pool where the mount points cannot be trusted, or in an alternate boot environment, where the typical paths are not valid. altroot is not a persistent property. It is valid only while the system is up."
  - -O canmount=off: Note the capital -O which makes this a file system property, not a pool property. File system cannot be mounted, and is ignored by zfs mount -a. This property is not inherited.
  - -O mountpoint=none: What it says on the tin, the pool has no mountpoint configured.
  - -O encryption=on: Makes this our encryptionroot and passes the encryption setting to all child datasets. Selecting encryption=on when creating a dataset indicates that the default encryption suite will be selected, which is currently aes-256-gcm.
  - -O keylocation=file://...: This property is only set for encrypted datasets which are encryption roots. Controls where the user's encryption key will be loaded from by default for commands such as zfs load-key.
  - -O keyformat=passphrase: Controls what format the user's encryption key will be provided as. Passphrases must be between 8 and 512 bytes long.
- At this time the newly created zpool is not mounted anywhere. Next we create the "root" dataset, that's an arbitrary term for the parent dataset of all boot environments. Boot environments in your case may be for example different operating systems all of which live on separate datasets underneath the root.
  - -o canmount=off: Same as above, the root dataset can - just like the pool - not be mounted.
  - -o mountpoint=none: Same as above, the root dataset has - just like the pool - no mountpoint configured.
  - zfs set org.zfsbootmenu:commandline=...: Set a common kernel command line for all boot environments such as "ro quiet".
- Neither the root dataset nor the pool are mounted at this time. We now create one boot environment dataset where we want to install Arch Linux.
  - -o mountpoint=/: Our Arch Linux dataset will be mounted at /.
  - -o canmount=noauto: When set to noauto, a dataset can only be mounted and unmounted explicitly. The dataset is not mounted automatically when the dataset is created or imported, nor is it mounted by the zfs mount -a command or unmounted by the zfs unmount -a command.
  - We then zpool set bootfs="zpool/root/archlinux" zpool: ZFSBootMenu uses the bootfs property to identify suitable boot environments. If only one pool has it - as is the case here - it identifies the pool's preferred boot dataset that will be booted with a 10-second countdown allowing manual interaction in ZFSBootMenu.
  - We explicitly mount the boot environment. Since the entire pool is still subject to our initial -R /mnt during creation a zfs mount zpool/root/archlinux will mount the Arch Linux dataset not into / but instead into /mnt.
- We also create a data dataset that - at least for now - we use to store only our /home data.
  - For zpool/data:
    - -o mountpoint=/: We use the mountpoint property here only for inheritance.
    - -o canmount=off: The zpool/data dataset itself cannot actually be mounted.
  - For a zpool/data/home child dataset:
    - We do not specify any properties. Since canmount cannot be inherited the parent's canmount=off does not apply, it instead defaults to canmount=on. The parent's mountpoint=/ property on the other hand is inherited so for a home child dataset it conveniently equals mountpoint=/home.
    - In effect this zpool/data/home dataset is subject to zfs mount -a and will happily automount into /home.
- We export the zpool once, we then reimport it by scanning only inside /dev/disk/by-partuuid, again setting -R /mnt as we did during pool creation a moment ago and we do not mount any file systems.
- We zfs load-key <encryptionroot> which will load the key from keylocation after which the keystatus property for <encryptionroot> and all child datasets will change from unavailable to available.
- We mount our Arch Linux boot environment dataset. It automatically gets prefixed with /mnt since we imported the pool with -R /mnt.
- We zfs mount -a which automounts zpool/data/home into /home, which again gets auto-prepended by /mnt.
- We lastly mount our EFI partition into /mnt/efi.
- We instruct ZFS to save its pool configuration via zpool set cachefile=/etc/zfs/zpool.cache zpool.
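The two path rules used in the steps above (a child inherits the parent's mountpoint plus its own name, and the altroot from -R is prepended to everything) can be sketched as plain string handling. This is an illustration only, ZFS does this internally; the helper names inherited_mountpoint and effective_mountpoint are hypothetical.

```shell
# Hypothetical helper: mountpoint a child dataset inherits from its parent.
inherited_mountpoint() {
    parent="${1%/}"       # "/" becomes "", "/opt" stays "/opt"
    printf '%s/%s\n' "${parent}" "$2"
}

# Hypothetical helper: join an altroot (for example /mnt) and a mountpoint
# property into the path where the dataset actually shows up.
effective_mountpoint() {
    altroot="${1%/}"      # strip one trailing slash, "/" becomes ""
    mountpoint="$2"
    case "${mountpoint}" in
        none|legacy) printf '%s\n' "${mountpoint}"; return 0 ;;
    esac
    result="${altroot}${mountpoint}"
    # Drop a trailing slash unless the result is the root directory itself
    [ "${result}" != '/' ] && result="${result%/}"
    printf '%s\n' "${result:-/}"
}

# zpool/root/archlinux has mountpoint=/ and lands at /mnt
effective_mountpoint '/mnt' '/'
# zpool/data/home inherits /home from zpool/data and lands at /mnt/home
effective_mountpoint '/mnt' "$(inherited_mountpoint '/' 'home')"
```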
The complete ZFS structure now exists and is mounted at /mnt
ready for any pacstrap
, debootstrap, dnf --installroot
or other bootstrapping action.
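For orientation, the steps from the overview can be written down as a command sequence. This is a sketch under assumptions, not a copy of what the arch-zbm script runs: the function name zfs_setup_plan is hypothetical, the partition paths are placeholders you pass in, and the key file path is the default mentioned under Mounting zpool for maintenance. The function only prints commands, it executes nothing.

```shell
# Print (do not run) commands matching the steps described above.
# Arguments: pool name, pool partition, EFI partition (all example values).
zfs_setup_plan() {
    pool="$1"
    pool_part="$2"
    esp_part="$3"
    cat <<EOF
zpool create -R /mnt -O canmount=off -O mountpoint=none -O encryption=on -O keylocation=file:///etc/zfs/zpool.key -O keyformat=passphrase ${pool} ${pool_part}
zfs create -o canmount=off -o mountpoint=none ${pool}/root
zfs set org.zfsbootmenu:commandline="ro quiet" ${pool}/root
zfs create -o mountpoint=/ -o canmount=noauto ${pool}/root/archlinux
zpool set bootfs=${pool}/root/archlinux ${pool}
zfs mount ${pool}/root/archlinux
zfs create -o mountpoint=/ -o canmount=off ${pool}/data
zfs create ${pool}/data/home
zpool export ${pool}
zpool import -d /dev/disk/by-partuuid -R /mnt -N ${pool}
zfs load-key ${pool}
zfs mount ${pool}/root/archlinux
zfs mount -a
mount ${esp_part} /mnt/efi
zpool set cachefile=/etc/zfs/zpool.cache ${pool}
EOF
}

# Example with placeholder partitions
zfs_setup_plan 'zpool' '/dev/sda2' '/dev/sda1'
```

Printing instead of executing keeps the sketch safe to try and makes it easy to compare against the bullet list line by line.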
Adding another boot environment-independent dataset
Assume that in addition to your /home
data which lives on zpool/data/home
you want another dataset that is exempt from Arch Linux snapshots.
Consider an example /opt/git
directory where a bunch of Git repos are checked out on which you work. You don't want them to be snapshotted - and rolled back - when something goes sideways: they are decoupled from everything else that goes on on your machine so you can easily and safely have a static /opt/git
directory available in all boot environments.
Move your current /opt/git
data out of the way for a moment:
mv '/opt/git'{,'.bak'}
Create datasets
zfs create -o canmount=off zpool/data/opt
zfs create zpool/data/opt/git
Remember that the zpool/data
dataset already exists and that it has both mountpoint=/
and canmount=off
set. It is not and cannot be mounted itself, it instead conveniently anchors datasets at /
. Since the canmount
dataset property cannot be inherited and defaults to canmount=on
we have to manually specify -o canmount=off
. Our new zpool/data/opt
should not automatically mount into /opt
.
We then create the child dataset zpool/data/opt/git
, it defaults to canmount=on
thus immediately shows up at /opt/git
.
Move data back into place and clean up temp directory
rsync -av --remove-source-files '/opt/git'{'.bak',}'/'
find '/opt/git.bak' -type d -empty -delete
An example zpool/data
dataset may now look like so:
# zfs list -r -oname,mountpoint,canmount,mounted zpool/data
NAME MOUNTPOINT CANMOUNT MOUNTED
zpool/data / off no
zpool/data/home /home on yes
zpool/data/opt /opt off no
zpool/data/opt/git /opt/git on yes
Nested environment-independent datasets
Caution
If you want a dedicated dataset for a directory that lives deeper in your file system tree than just /opt/git
, for example like /var/lib/docker
make sure to not recursively create this structure in a single zfs create
command.
In Adding another boot environment-independent dataset above you can safely do:
zfs create -o canmount=off zpool/data/opt
Here zpool/data
already exists, you're only creating one child dataset opt
and you're setting -o canmount=off
so that it never mounts into your /opt
directory.
Now consider the same setup for /var/lib/docker
. If you follow the exact same approach:
zfs create -o canmount=off zpool/data/var/lib
ZFS will correctly report:
cannot create 'zpool/data/var/lib': parent does not exist
You might want to just create the parent then with -p
argument:
zfs create -p -o canmount=off zpool/data/var/lib
Note, however, that -o canmount=off
only applies to lib
dataset and that zpool/data/var
has just been auto-mounted into /var
:
# zfs list -r -oname,mountpoint,canmount,mounted zpool/data
NAME MOUNTPOINT CANMOUNT MOUNTED
zpool/data / off no
zpool/data/home /home on yes
zpool/data/opt /opt off no
zpool/data/opt/git /opt/git on yes
zpool/data/var /var on yes <---
zpool/data/var/lib /var/lib off no
Advice
Instead create nested parents in multiple steps where you set each one to -o canmount=off
:
zfs create -o canmount=off zpool/data/var
zfs create -o canmount=off zpool/data/var/lib
Lastly create the dataset you want mounted:
zfs create zpool/data/var/lib/docker
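The same step-by-step pattern works for any depth. As a minimal sketch, the hypothetical helper nested_create_cmds below prints one zfs create per level, giving every intermediate parent -o canmount=off and leaving only the last dataset at the default. It assumes none of the levels below the base dataset exist yet and it only prints commands.

```shell
# Hypothetical helper: print one "zfs create" per level below an existing
# base dataset. Intermediate parents get canmount=off, the last dataset
# keeps the default canmount=on.
nested_create_cmds() {
    base="$1"        # existing dataset, e.g. zpool/data
    rel="$2"         # path below it, e.g. var/lib/docker
    current="${base}"
    while [ -n "${rel}" ]; do
        part="${rel%%/*}"
        current="${current}/${part}"
        case "${rel}" in
            */*) rel="${rel#*/}"
                 printf 'zfs create -o canmount=off %s\n' "${current}" ;;
            *)   rel=''
                 printf 'zfs create %s\n' "${current}" ;;
        esac
    done
}

nested_create_cmds 'zpool/data' 'var/lib/docker'
```

For var/lib/docker this prints the same three commands as shown above.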
Mounting zpool for maintenance
In case you want to mount your zpool on an external operating system such as an Arch Linux live CD ISO image do it like so:
zpool import zpool -d /dev/disk/by-partuuid -R /mnt -f -N
zfs load-key -L prompt zpool
zfs mount zpool/root/archlinux
zfs mount -a
# UEFI system ...
mount /dev/sda1 /mnt/efi
# ... or legacy BIOS system
mount /dev/sda1 /mnt/boot/syslinux
arch-chroot /mnt /bin/bash
When done exit chroot
and cleanly remove your work:
# UEFI system ...
umount /mnt/efi
# ... or legacy BIOS system
umount /mnt/boot/syslinux
zfs umount -a
zpool export zpool
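Which of the two boot partition mount targets applies depends on the firmware mode. Reusing the /sys/firmware/efi check from the Prep section, a small sketch can pick the target for you. The helper name boot_mount_target is hypothetical, and /dev/sda1 is the example partition from above.

```shell
# Hypothetical helper: pick the mount target used above depending on
# whether the given sysfs root contains firmware/efi.
boot_mount_target() {
    sysfs_root="${1:-/sys}"
    if [ -d "${sysfs_root}/firmware/efi" ]; then
        printf '/mnt/efi\n'
    else
        printf '/mnt/boot/syslinux\n'
    fi
}

# Print the mount command for the example partition, nothing is mounted
printf 'mount /dev/sda1 %s\n' "$(boot_mount_target)"
```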
Explanation:
- We always want to import pools by-partuuid for consistency so we specifically only look for pools at /dev/disk/by-partuuid.
- We import our zpool with -R /mnt (aka -o cachefile=none -o altroot=/mnt). The pool is never cached, i.e. it's considered temporary. All pool and dataset mount paths have /mnt prepended. From man zpoolprops: This can be used when examining an unknown pool where the mount points cannot be trusted, or in an alternate boot environment, where the typical paths are not valid. altroot is not a persistent property. It is valid only while the system is up.
- With -f and -N we force-import our pool (-f) even if it previously wasn't cleanly exported; and we do not auto-mount any of its datasets (-N), not even the ones that have canmount=on set.
  # zfs list -oname,mountpoint,canmount,mounted
  NAME                  MOUNTPOINT  CANMOUNT  MOUNTED
  zpool                 none        off       no
  zpool/data            /mnt        off       no
  zpool/data/home       /mnt/home   on        no   <-- Not immediately mounted
  zpool/root            none        off       no
  zpool/root/archlinux  /mnt        noauto    no   <-- Not immediately mounted
- We load the decryption key by temporarily overriding the keylocation property with -L prompt. The default value is file:///etc/zfs/zpool.key which in all likelihood doesn't exist in this environment.
- We mount our desired boot environment with zfs mount zpool/root/archlinux
  # zfs list -oname,mountpoint,canmount,mounted
  NAME                  MOUNTPOINT  CANMOUNT  MOUNTED
  zpool                 none        off       no
  zpool/data            /mnt        off       no
  zpool/data/home       /mnt/home   on        no
  zpool/root            none        off       no
  zpool/root/archlinux  /mnt        noauto    yes  <-- Only boot env now mounted
- We mount all child datasets with zfs mount -a making /mnt/home available as well as any others you may have created yourself.
  # zfs list -oname,mountpoint,canmount,mounted
  NAME                  MOUNTPOINT  CANMOUNT  MOUNTED
  zpool                 none        off       no
  zpool/data            /mnt        off       no
  zpool/data/home       /mnt/home   on        yes  <-- Now mounted
  zpool/root            none        off       no
  zpool/root/archlinux  /mnt        noauto    yes  <-- Now mounted
- We lastly mount our EFI System Partition (ESP), in this example it's living at /dev/sda1 so adjust this path accordingly.
  # df -hTP
  Filesystem            Type  Size  Used  Avail  Use%  Mounted on
  ...                   ...   ...   ...   ...    ...   ...
  zpool/root/archlinux  zfs   8.6G  2.5G  6.2G   29%   /mnt
  zpool/data/home       zfs   6.3G  161M  6.2G   3%    /mnt/home
  /dev/sda1             vfat  511M  31M   481M   6%    /mnt/efi
- We're ready to arch-chroot into our boot environment.
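Before entering the chroot it can be worth confirming that every dataset with canmount=on really got mounted. A minimal sketch, assuming the tab-separated scripting output of zfs list -H; the function name unmounted_automount_datasets is hypothetical and the sample input mirrors the listings above.

```shell
# Read "name<TAB>canmount<TAB>mounted" lines on stdin and print datasets
# that zfs mount -a would handle (canmount=on) but that are not mounted.
unmounted_automount_datasets() {
    awk -F '\t' '$2 == "on" && $3 == "no" { print $1 }'
}

# Sample input as it looks right after "zpool import ... -N"
printf 'zpool/data\toff\tno\nzpool/data/home\ton\tno\nzpool/root/archlinux\tnoauto\tno\n' |
    unmounted_automount_datasets
```

On a live system the input would come from zfs list -H -o name,canmount,mounted -r zpool instead of printf. An empty result after zfs mount -a means nothing was missed.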
Development
Conventional commits
This project uses Conventional Commits for its commit messages.
Commit types
Commit types besides fix
and feat
are:
- build: Project structure, directory layout, build instructions for roll-out
- refactor: Keeping functionality while streamlining or otherwise improving function flow
- test: Working on test coverage
- docs: Documentation for project or components
Commit scopes
The following scopes are known for this project. A Conventional Commits commit message may optionally use one of the following scopes or none:
- iso: Changing Arch Linux live CD ISO image
- zbm: Adjusting ZFSBootMenu's behavior
- zfs: A change to how ZFS interacts with the system, either a pool or a dataset
- os: Getting an operating system set up to correctly work in a ZFS boot environment
- meta: Affects the project's repo layout, readme content, file names etc.
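The types and scopes listed above can be checked mechanically. The following is a sketch only and not something this repo ships: the function name valid_commit_subject is hypothetical, and the pattern covers just the subject line in the type(scope): description form, with the optional ! marker from Conventional Commits.

```shell
# Hypothetical check: succeed if the first line of a commit message uses a
# known type and, optionally, one of the known scopes.
valid_commit_subject() {
    printf '%s\n' "$1" | head -n 1 | grep -Eq \
        '^(fix|feat|build|refactor|test|docs)(\((iso|zbm|zfs|os|meta)\))?!?: .+'
}

if valid_commit_subject 'feat(zbm): add countdown option'; then
    echo 'subject looks fine'
fi
```

A check like this could be called from a commit-msg Git hook if you want it enforced locally.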
Credits
Most of what's here was shamelessly copied and slightly adapted for personal use from Jonathan Kirszling at GitHub.
Thanks to:
- Jonathan Kirszling:
- Maurizio Oliveri:
- Zach Dykstra, Andrew J. Hesford and all other ZFSBootMenu contributors:
- github.com/kongkrit: