hygienic-books ec7931cb4c docs(zfs): Warn about nested datasets creation and auto-mounting (#3 )

2023-05-16 14:11:14 +02:00

arch-zbm

Helper script to install Arch Linux with ZFSBootMenu from within a running Arch Linux live CD ISO image

Prep

We expect minimal prep on your end. Please make sure that before execution the following conditions are met.

Arch Linux live CD ISO image sees exactly one partition with partition type code BF00 ("Solaris root")
Arch Linux live CD ISO image sees exactly one partition with partition type code EF00 ("EFI system partition")
The EF00 EFI partition is mountable. In practical terms this usually just means it has a file system.
No ZFS zpool exists
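
The partition conditions above can be checked by hand before running the script. The following is a minimal sketch and not part of setup.sh. It assumes that sgdisk's short codes EF00 and BF00 correspond to the GPT type GUIDs shown, and that lsblk can print the PATH and PARTTYPE columns. The helper name count_parttype is made up for illustration.

```shell
# GPT partition type GUIDs that sgdisk abbreviates as EF00 and BF00
# (assumption: lsblk reports them in this form in its PARTTYPE column).
GUID_EF00='c12a7328-f81f-11d2-ba4b-00a0c93ec93b'
GUID_BF00='6a85cf4d-1dd2-11b2-99a6-080020736631'

# count_parttype GUID
# Reads "PATH PARTTYPE" lines on stdin and prints how many of them carry
# the given GUID. Hypothetical helper, comparison is case-insensitive.
count_parttype() (
    want="$(printf '%s' "$1" | tr 'A-F' 'a-f')"
    awk -v want="${want}" 'tolower($2) == want { n++ } END { print n + 0 }'
)

# Usage on the live ISO (not executed here):
#   lsblk --noheadings --list --paths --output PATH,PARTTYPE \
#       | count_parttype "${GUID_BF00}"
```

The requirements are met when the count is exactly 1 for each of the two GUIDs.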

How to prep

On a blank example disk /dev/sda you can fulfill the requirements (one EF00 partition with a file system plus one BF00 partition) like so:

sgdisk --new '1::+100M' --new '2' --typecode '1:EF00' --typecode '2:BF00' /dev/sda
mkfs.vfat /dev/sda1

--new '1::+100M': Create partition number 1. The field separator : separates the partition number from start sector. In this case start sector is unspecified so start sector sits at whatever the system's default is for this operation. On a blank disk on an Arch Linux live CD ISO image this will default to sector 2048. Partition ends at whatever the beginning is +100M meaning plus 100 Mebibytes.

--new '2': Create partition number 2. Both field number 2, the start sector, and field number 3, the end sector, are unspecified, so there's no field separator : at all. Field number 2 will be the first free sector - in this case right after partition 1 - and field number 3 will be the end of the disk. Thus partition 2 will fill the remaining free disk space.

--typecode '1:EF00': Partition 1 gets partition type code EF00, an EFI system partition.

--typecode '2:BF00': Partition 2 gets partition type code BF00, a Solaris root partition.
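
As a quick sanity check of the +100M arithmetic described above, the expected last sector of partition 1 can be computed by hand. This sketch assumes 512-byte logical sectors, which the text above does not state; a disk with 4096-byte sectors gives different numbers.

```shell
# Assumption: 512-byte logical sectors, start sector 2048 as described above.
sector_size=512
start_sector=2048
size_bytes=$((100 * 1024 * 1024))        # +100M means 100 MiB

sectors=$((size_bytes / sector_size))    # sectors occupied by partition 1
end_sector=$((start_sector + sectors - 1))

echo "partition 1: sectors ${start_sector}-${end_sector} (${sectors} sectors)"
```

Comparing these figures against sgdisk --print /dev/sda is one way to confirm the partition landed where you expect.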

The result will look something like this, at which point you can start the setup.sh script. See How to run this? below for more details.

# lsblk --paths 
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
/dev/loop0     7:0    0 688.5M  1 loop /run/archiso/airootfs
/dev/sr0      11:0    1 810.3M  0 rom  /run/archiso/bootmnt
/dev/sda     202:0    0    10G  0 disk 
├─/dev/sda1  202:1    0   100M  0 part
└─/dev/sda2  202:2    0   9.9G  0 part

ZFS dataset layout

The script will create a single ZFS pool named zpool on the BF00 partition with a child dataset zpool/root which itself has one child zpool/root/archlinux, that's where Arch Linux gets installed. Parallel to zpool/root it'll create zpool/data with a zpool/data/home child dataset that gets mounted at /home.

The script will use the EF00 partition to install a ZFSBootMenu EFI executable if efibootmgr says that no such ZFSBootMenu entry exists. If ZFSBootMenu gets added to the EFI partition it becomes the primary boot option.

How to run this?

Boot an Arch Linux live CD ISO image
Run:
```
export SCRIPT_URL='https://quico.space/quico-os-setup/arch-zbm/raw/branch/main/setup.sh'
curl -s "${SCRIPT_URL}" | bash
```
During execution the script will call itself when it changes into its chroot, that's why we export SCRIPT_URL. Feel free to update "${SCRIPT_URL}" with whatever branch or revision you want to use from quico.space/quico-os-setup/arch-zbm. Typically .../branch/main/setup.sh as shown above is what you want.

Options

Compression

By default we create a zpool with ZFS property compression=on. If the lz4_compress pool feature is active this will by default enable compression=lz4. See man 7 zfsprops, for example in ZFS 2.1.9, for details. Run zpool get feature@lz4_compress <pool> to check this feature's status on your <pool>.

To get a zpool with uncompressed datasets export the shell variable ARCHZBM_ZFSPROPS_NO_COMPRESSION with any value prior to running this script. Literally any value works as long as you're not setting this to an empty string:

export ARCHZBM_ZFSPROPS_NO_COMPRESSION=yesplease

Encryption

By default we encrypt the zpool with ZFS property encryption=on. In ZFS 2.1.9 this defaults to encryption=aes-256-gcm.

To get a zpool with unencrypted datasets export the shell variable ARCHZBM_ZFSPROPS_NO_ENCRYPTION with any value prior to running this script:

export ARCHZBM_ZFSPROPS_NO_ENCRYPTION=yup
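
The "any non-empty value" rule maps onto the shell's -z and -n tests. The sketch below shows one way a script could turn the two variables into zpool create -O arguments. The function name zfsprops_args is hypothetical and this is not necessarily how setup.sh implements it.

```shell
# zfsprops_args
# Prints the -O arguments that remain after honoring the two opt-out
# variables. Hypothetical helper for illustration only.
zfsprops_args() (
    args=''
    # An unset or empty variable keeps the default, anything else opts out.
    if [ -z "${ARCHZBM_ZFSPROPS_NO_COMPRESSION:-}" ]; then
        args="${args} -O compression=on"
    fi
    if [ -z "${ARCHZBM_ZFSPROPS_NO_ENCRYPTION:-}" ]; then
        args="${args} -O encryption=on"
    fi
    # Strip the single leading space left over from concatenation.
    printf '%s\n' "${args# }"
)
```

With neither variable set the function prints both arguments. Exporting ARCHZBM_ZFSPROPS_NO_COMPRESSION=yesplease leaves only the encryption argument, while an empty string changes nothing.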

Steps

The script takes the following installation steps.

Install ZFS tools and kernel module with github.com/eoli3n/archiso-zfs
Create one ZFS zpool on top of the BF00 partition with encrypted and compressed datasets, the passphrase being the literal string password
1. See Compression and Encryption above to optionally disable either property
Create dataset for Arch Linux and /home
Install Arch Linux into pool
Add ZFSBootMenu to EF00 partition if it doesn't exist already
Exit into Arch Linux live CD ISO image shell for you to reboot and frolic

Flavor choices

We make the following opinionated flavor choices. Feel free to change them to your liking.

Arch Linux locale is set to en_US.UTF-8
Keymap is de-latin1
- Consult /etc/vconsole.conf
- Change zfs set org.zfsbootmenu:commandline=...
No X.Org Server, Wayland compositors or other GUI elements get installed
Timezone is Etc/UTC
- Check timedatectl set-timezone <tzdata-zone>

Post-run manual steps

After installation you're going to want to at least touch these points in your new Arch Linux install:

Package manager hook: pacman does not have a hook to do ZFS snapshots
- See this GitHub gist and zfs-snapshotter.bash for inspiration
Hostname: Installation chose a pseudo-randomly generated 8-character string with pwgen
- Check hostnamectl set-hostname <hostname>
Unprivileged user accounts: The OS was installed with root and unprivileged build users
Passwords
- ZFS: The password for all datasets underneath zpool is password.
- Local root account: The local root account's password is password.
Arch User Repository (AUR) helper: We installed paru as our AUR helper, built from GitHub via makepkg -si.
In /etc/systemd/network/50-wired.network instead of a DHCP-based network config you can get a static one. The DHCP-based one for reference looks like:
```
...

[Network]
DHCP=ipv4
IPForward=yes
Domains=~.

[DHCP]
UseDNS=yes
RouteMetric=10
```
A static config does away with the [DHCP] section:
```
...

[Network]
Address=10.10.10.2/24
Gateway=10.10.10.1
DNS=10.10.10.1
IPForward=yes
Domains=~.
```

Password change

After installation you're going to want to change your ZFS encryption password.

Steps

In a running OS:

Change password in keylocation file, e.g. /etc/zfs/zpool.key or whatever other "${zpool_name}"'.key' file you used during setup
Set this key as the new encryption key:
```
zfs change-key -l zpool
```
Quoting man 8 zfs-change-key from zfs-utils version 2.1.9 for the -l argument: "Ensures the key is loaded before attempting to change the key." When successful the command will not output data, it'll just silently change your encryption key.
Rebuild initramfs:
```
mkinitcpio -P
```
Here for example with -P (--allpresets) which processes all presets contained in /etc/mkinitcpio.d. This step puts the changed key file into your initramfs. During setup we've adjusted /etc/mkinitcpio.conf so that it contains FILES=(/etc/zfs/zpool.key) which causes the file to be added to initramfs as-is.
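
Before running zfs change-key it can help to confirm that the new key file is acceptable. ZFS requires passphrases to be between 8 and 512 bytes long (see keyformat=passphrase under ZFS setup explained). The sketch below checks that; it assumes ZFS ignores a single trailing newline in the key file, and the helper name passphrase_ok is made up for illustration.

```shell
# passphrase_ok FILE
# Succeeds when the first line of FILE, without its trailing newline,
# is between 8 and 512 bytes long. Hypothetical helper.
passphrase_ok() (
    # Arithmetic expansion strips any padding that wc may print.
    len=$(( $(head -n 1 "$1" | tr -d '\n' | wc -c) ))
    [ "${len}" -ge 8 ] && [ "${len}" -le 512 ]
)

# Usage (not executed here):
#   passphrase_ok /etc/zfs/zpool.key \
#       && zfs change-key -l zpool \
#       && mkinitcpio -P
```

Chaining with && means the initramfs only gets rebuilt when the key change itself succeeded.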

Boot flow

With your password changed in two locations (key file and initramfs), the boot process works as follows.

At boot time ZFSBootMenu will scan all pools that it can import for a bootfs property. If it only finds one pool with that property the dataset given as bootfs will be selected for boot with a 10-second countdown allowing manual interaction. With bootfs set ZFSBootMenu will not actively search through datasets for valid kernel and initramfs combinations, it'll instead accept bootfs as the default boot entry without us entering the pool decryption passphrase.

Upon loading into a given dataset ZFSBootMenu will attempt to auto-load the matching decryption key. In our setup this will fail because we purposely stored the encryption key inside our zpool/root/archlinux dataset. ZFSBootMenu will prompt us to type in the decryption key.

Lastly ZFSBootMenu loads our OS' kernel and initramfs combination via kexec. For this step we don't need to enter the decryption key again. Our initramfs file contains the plain-text /etc/zfs/zpool.key file which allows it to seamlessly import the right dataset, load its key and mount it.
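
Since the last boot stage depends on the key file being present inside the initramfs, it is worth checking after mkinitcpio -P. The sketch below filters a file listing for the key path. It assumes the listing comes from lsinitcpio, which ships with mkinitcpio, and that your image lives at /boot/initramfs-linux.img; the helper name initramfs_has_key is hypothetical.

```shell
# initramfs_has_key [KEYPATH]
# Reads a file listing (one path per line) on stdin and succeeds when
# KEYPATH appears in it, with or without a leading slash.
# Hypothetical helper, KEYPATH defaults to etc/zfs/zpool.key.
initramfs_has_key() (
    key="${1:-etc/zfs/zpool.key}"
    key="${key#/}"
    grep -q -x -F -e "${key}" -e "/${key}"
)

# Usage (not executed here, image path is an assumption):
#   lsinitcpio /boot/initramfs-linux.img | initramfs_has_key \
#       && echo 'key file is in the initramfs'
```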

Caveats in a password change

ZFS differentiates between user keys - also called wrapping keys - and the master key for any given encryption root. You never interact with the master key, you only pick your personal user key. Consequently a user key change (in our use case we perceive this simply as a password change) has zero effect on data that's already encrypted. The operation is instant and merely reencrypts the already existing master key, the so-called wrapped master key.

ZFS generates the master key exactly once when you enable encryption on a dataset - technically when it becomes an encryption root. Among other inputs it uses your user key to encrypt (to wrap) the master key. When you change your user key it just means that the master key stays exactly the same and only the encrypted (wrapped) key changes.

man 8 zfs-change-key from zfs-utils version 2.1.9 adds:

If the user's key is compromised, zfs change-key does not necessarily protect existing or newly-written data from attack. Newly-written data will continue to be encrypted with the same master key as the existing data. The master key is compromised if an attacker obtains a user key and the corresponding wrapped master key. Currently, zfs change-key does not overwrite the previous wrapped master key on disk, so it is accessible via forensic analysis for an indeterminate length of time.

In the event of a master key compromise, ideally the drives should be securely erased to remove all the old data (which is readable using the compromised master key), a new pool created, and the data copied back. This can be approximated in place by creating new datasets, copying the data (e.g. using zfs send | zfs recv), and then clearing the free space with zpool trim --secure if supported by your hardware, otherwise zpool initialize.

On one hand changing the ZFS encryption password is generally a good and useful thing to do. On the other hand changing your password does not currently overwrite previous wrapped master keys on disk. A sufficiently motivated party that gains access to a wrapped master key and the matching user key is able to decrypt the master key and use it to read all data encrypted with it.

By extension this means after a password change your data remains at risk until you've copied it to a new dataset and erased previously used space thereby erasing any previous wrapped master keys.

Changing master key

In order to generate a new master key after you've changed your user key as mentioned in man 8 zfs-change-key from zfs-utils version 2.1.9 one example workflow goes like this:

Change user key
- Update /etc/zfs/zpool.key
- Update zpool with new key via zfs change-key -l zpool
- Generate new initramfs with mkinitcpio -P

Create a snapshot from current system dataset

# Assuming current system dataset is zpool/root/archlinux-sxu
# where '-sxu' is a random suffix to differentiate datasets
# and has no real meaning
zfs snapshot zpool/root/archlinux-sxu@rekey

Within same pool send/receive snapshot
```
zfs send \
    --large-block \
    --compressed \
    'zpool/root/archlinux-sxu@rekey' | \

zfs receive \
    -Fvu \
    -o 'encryption=on' \
    -o 'keyformat=passphrase' \
    -o 'keylocation=file:///etc/zfs/zpool.key' \
    -o 'mountpoint=/' \
    -o 'canmount=noauto' \
    -o 'org.zfsbootmenu:commandline=rw nowatchdog rd.vconsole.keymap=de-latin1' \
    'zpool/root/archlinux-frn'
```
Explanation:
- We specifically don't zfs send -R (--replicate). While it would normally be nice to transfer all of a dataset's children at once, such as all of its snapshots, the -R argument conflicts with the encryption property. See the comment by Tom Caputi on GitHub openzfs/zfs issue 10507 from June 2020 for details. Basically if encryption is set then -R doesn't work. We could transfer existing encryption properties with -w/--raw but we don't actually want to transfer encryption properties at all. We want them to change during transfer, see the bullet point below on the keyformat, keylocation and encryption properties.
- We zfs receive -F destroying any target snapshots and file systems beyond the snapshot we're transferring. In this example the target zpool/root/archlinux-frn doesn't even exist so -F isn't necessary to clean anything up. It's just good practice.
- With -v we get verbose progress output
- Argument -u makes sure the dataset does not get mounted after transfer. ZFS would mount it into / which wouldn't be helpful since we're currently using that filesystem ourselves.
- We set encryption properties keyformat, keylocation and most importantly encryption. The latter will turn our transferred dataset into its own encryptionroot which in turn generates a new master key. The auto-generated new master key gets wrapped with our updated passphrase in keylocation. This basically reencrypts all data in this dataset during transfer.
- We set mountpoint and canmount as well as an org.zfsbootmenu:commandline as we would for any new system dataset.
Change zpool's bootfs property to new system dataset
```
zpool set bootfs=zpool/root/archlinux-frn zpool
```
Boot into new system dataset
After reboot and now that you're in the new system dataset change its encryptionroot by letting it inherit its key from its parent:
```
zfs change-key -i -l zpool/root/archlinux-frn
```
The parent zpool/root is inheriting this property from zpool which will make sure that zpool/root/archlinux-frn essentially gets its key now from zpool. Both zpool/root/archlinux-frn and zpool use the same exact keylocation with identical content. This operation is instant.
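
To see whether the inheritance took effect you can list every dataset whose encryption root is not the pool itself. The following is a small sketch, assuming the tab-separated output of zfs get -H; the helper name foreign_encryption_roots is made up for illustration.

```shell
# foreign_encryption_roots POOL
# Reads "NAME<TAB>ENCRYPTIONROOT" lines on stdin and prints every dataset
# whose encryption root is something other than POOL. Unencrypted
# datasets, which report "-", are skipped. Hypothetical helper.
foreign_encryption_roots() (
    awk -F '\t' -v pool="$1" \
        '$2 != pool && $2 != "-" { print $1 " -> " $2 }'
)

# Usage (not executed here):
#   zfs get -H -r -t filesystem -o name,value encryptionroot zpool \
#       | foreign_encryption_roots zpool
```

Before zfs change-key -i the new dataset shows up as its own encryption root, afterwards the output should be empty.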

Finishing touches

Confirm master key change

Just to confirm that the master key has changed run this command. It takes a moment to output data:

zfs send --raw zpool/root/archlinux-frn@rekey | zstream dump | sed -n -e '/crypt_keydata/,/end crypt/p; /END/q'

Repeat for source dataset zpool/root/archlinux-sxu@rekey. You're particularly interested in parameters DSL_CRYPTO_MASTER_KEY_1 and the initialization vector DSL_CRYPTO_IV. Notice that they differ between old and new dataset confirming that your new dataset has a new master key.

Clean-up

Clean up:

In newly keyed/reencrypted system dataset destroy its snapshot
```
zfs destroy zpool/root/archlinux-frn@rekey
```

Recursively destroy source dataset

zfs destroy -r zpool/root/archlinux-sxu

Unmap/TRIM

Next up unmap/TRIM unallocated disk areas. If your zpool runs on an entire disk and not just on a partition, and if your disk supports TRIM you're going to want to do:

zpool trim --secure zpool

The next best alternative is to instead do:

zpool initialize zpool

View status with either one of:

# With TRIM status
zpool status -t zpool

# Without TRIM status
zpool status zpool

ZFS setup explained

Overview

The ZFS pool and dataset setup that makes this tick, explained in plain English.

Create zpool with options:
1. -R /mnt (aka -o cachefile=none -o altroot=/mnt). The pool is never cached, i.e. it's considered temporary. All pool and dataset mount paths have /mnt prepended. From man zpoolprops:
  
  This can be used when examining an unknown pool where the mount points cannot be trusted, or in an alternate boot environment, where the typical paths are not valid. altroot is not a persistent property. It is valid only while the system is up.
2. -O canmount=off: Note the capital -O which makes this a file system property, not a pool property. File system cannot be mounted, and is ignored by zfs mount -a. This property is not inherited.
3. -O mountpoint=none: What it says on the tin, the pool has no mountpoint configured.
4. -O encryption=on: Makes this our encryptionroot and passes the encryption setting to all child datasets. Selecting encryption=on when creating a dataset indicates that the default encryption suite will be selected, which is currently aes-256-gcm.
5. -O keylocation=file://...: This property is only set for encrypted datasets which are encryption roots. Controls where the user's encryption key will be loaded from by default for commands such as zfs load-key.
6. -O keyformat=passphrase: Controls what format the user's encryption key will be provided as. Passphrases must be between 8 and 512 bytes long.
At this time the newly created zpool is not mounted anywhere. Next we create the "root" dataset, that's an arbitrary term for the parent dataset of all boot environments. Boot environments in your case may be, for example, different operating systems, all of which live on separate datasets underneath the root.
1. -o mountpoint=none: Same as above, the root dataset has - just like the pool - no mountpoint configured.
2. zfs set org.zfsbootmenu:commandline=...: Set a common kernel command line for all boot environments such as "ro quiet".
Neither the root dataset nor the pool are mounted at this time. We now create one boot environment dataset where we want to install Arch Linux.
1. -o mountpoint=/: Our Arch Linux dataset will be mounted at /.
2. -o canmount=noauto: When set to noauto, a dataset can only be mounted and unmounted explicitly. The dataset is not mounted automatically when the dataset is created or imported, nor is it mounted by the zfs mount -a command or unmounted by the zfs unmount -a command.
3. We then zpool set bootfs="zpool/root/archlinux" zpool: ZFSBootMenu uses the bootfs property to identify suitable boot environments. If only one pool has it - as is the case here - it identifies the pool's preferred boot dataset that will be booted with a 10-second countdown allowing manual interaction in ZFSBootMenu.
4. We explicitly mount the boot environment. Since the entire pool is still subject to our initial -R /mnt during creation a zfs mount zpool/root/archlinux will mount the Arch Linux dataset not into / but instead into /mnt.
We also create a data dataset that - at least for now - we use to store only our /home data.
1. For zpool/data:
  1. -o mountpoint=/: We use the mountpoint property here only for inheritance.
  2. -o canmount=off: The zpool/data dataset itself cannot actually be mounted.
2. For a zpool/data/home child dataset:
  1. We do not specify any properties. Since canmount cannot be inherited the parent's canmount=off does not apply, it instead defaults to canmount=on. The parent's mountpoint=/ property on the other hand is inherited so for a home child dataset it conveniently equals mountpoint=/home.
  2. In effect this zpool/data/home dataset is subject to zfs mount -a and will happily automount into /home.
We export the zpool once, we then reimport it by scanning only inside /dev/disk/by-partuuid, again setting -R /mnt as we did during pool creation a moment ago and we do not mount any file systems.
We zfs load-key <encryptionroot> which will load the key from keylocation after which the keystatus property for <encryptionroot> and all child datasets will change from unavailable to available.
We mount our Arch Linux boot environment dataset. It automatically gets prefixed with -R /mnt since that's how we imported the pool.
We zfs mount -a which automounts zpool/data/home into /home, which again gets auto-prepended by /mnt.
We lastly mount our EFI partition into /mnt/efi.
We instruct ZFS to save its pool configuration via zpool set cachefile=/etc/zfs/zpool.cache zpool.

The complete ZFS structure now exists and is mounted at /mnt ready for any pacstrap, debootstrap, dnf --installroot or other bootstrapping action.
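
The overview above can be condensed into a command sequence. The sketch below only prints the commands instead of running them, and it is our reading of the description rather than a copy of setup.sh. The function name print_zfs_plan is hypothetical, all four arguments are supplied by the caller, the mkdir line is our addition, and 'ro quiet' is merely the example command line mentioned above.

```shell
# print_zfs_plan POOL ZFS_PART EFI_PART KEYLOCATION
# Prints, and does not run, commands matching the overview above.
# Nothing is detected automatically. Hypothetical helper.
print_zfs_plan() (
    pool="$1" zfs_part="$2" efi_part="$3" keylocation="$4"
    be="${pool}/root/archlinux"    # the boot environment dataset
    cat <<EOF
zpool create -R /mnt -O canmount=off -O mountpoint=none -O encryption=on -O keylocation=${keylocation} -O keyformat=passphrase ${pool} ${zfs_part}
zfs create -o mountpoint=none ${pool}/root
zfs set org.zfsbootmenu:commandline='ro quiet' ${pool}/root
zfs create -o mountpoint=/ -o canmount=noauto ${be}
zpool set bootfs=${be} ${pool}
zfs mount ${be}
zfs create -o mountpoint=/ -o canmount=off ${pool}/data
zfs create ${pool}/data/home
zpool export ${pool}
zpool import -d /dev/disk/by-partuuid -R /mnt -N ${pool}
zfs load-key ${pool}
zfs mount ${be}
zfs mount -a
mkdir -p /mnt/efi
mount ${efi_part} /mnt/efi
zpool set cachefile=/etc/zfs/zpool.cache ${pool}
EOF
)

# Usage (prints only):
#   print_zfs_plan zpool /dev/sda2 /dev/sda1 file:///etc/zfs/zpool.key
```

Printing first and executing later keeps the sketch harmless on a machine that already has data on it.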

Adding another boot environment-independent dataset

Assume that in addition to your /home data which lives on zpool/data/home you want another dataset that is exempt from Arch Linux snapshots.

Consider an example /opt/git directory where a bunch of Git repos are checked out on which you work. You don't want them to be snapshotted - and rolled back - when something goes sideways: they are decoupled from everything else that goes on on your machine so you can easily and safely have a static /opt/git directory available in all boot environments.

Move your current /opt/git data out of the way for a moment:

mv '/opt/git'{,'.bak'}

Create datasets

zfs create -o canmount=off zpool/data/opt
zfs create zpool/data/opt/git

Remember that the zpool/data dataset already exists and that it has both mountpoint=/ and canmount=off set. It is not and cannot be mounted itself, it instead conveniently anchors datasets at /. Since the canmount dataset property cannot be inherited and defaults to canmount=on we have to manually specify -o canmount=off. Our new zpool/data/opt should not automatically mount into /opt.

We then create the child dataset zpool/data/opt/git, it defaults to canmount=on thus immediately shows up at /opt/git.

Move data back into place and clean up temp directory

rsync -av --remove-source-files '/opt/git'{'.bak',}'/'
find '/opt/git.bak' -type d -empty -delete

An example zpool/data dataset may now look like so:

# zfs list -r -oname,mountpoint,canmount zpool/data
NAME                MOUNTPOINT  CANMOUNT
zpool/data          /           off
zpool/data/home     /home       on
zpool/data/opt      /opt        off
zpool/data/opt/git  /opt/git    on

Nested environment-independent datasets

Caution

If you want a dedicated dataset for a directory that lives deeper in your file system tree than just /opt/git, for example /var/lib/docker, make sure to not recursively create this structure in a single zfs create command.

In Adding another boot environment-independent dataset above you can safely do:

zfs create -o canmount=off zpool/data/opt

Here zpool/data already exists, you're only creating one child dataset opt and you're setting -o canmount=off so that it never mounts into your /opt directory.

Now consider the same setup for /var/lib/docker. If you follow the exact same approach:

zfs create -o canmount=off zpool/data/var/lib

ZFS will correctly report:

cannot create 'zpool/data/var/lib': parent does not exist

You might be tempted to just create the missing parent with the -p argument:

zfs create -p -o canmount=off zpool/data/var/lib
           ~~

Note, however, that -o canmount=off only applies to the lib dataset and that zpool/data/var has just been auto-mounted into /var:

# zfs list -r -oname,mountpoint,canmount,mounted zpool/data
NAME                MOUNTPOINT  CANMOUNT  MOUNTED
zpool/data          /           off       no
zpool/data/home     /home       on        yes
zpool/data/opt      /opt        off       no
zpool/data/opt/git  /opt/git    on        yes
zpool/data/var      /var        on        yes  <---
zpool/data/var/lib  /var/lib    off       no

Advice

Instead create nested parents in multiple steps where you set each one to -o canmount=off:

zfs create -o canmount=off zpool/data/var
zfs create -o canmount=off zpool/data/var/lib

Lastly create the dataset you want mounted:

zfs create zpool/data/var/lib/docker
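
The step-by-step approach generalizes to any depth. The sketch below prints one zfs create command per path component so that every intermediate dataset gets canmount=off and only the last one stays mountable. The helper name nested_create_plan is made up for illustration and the commands are printed, not executed.

```shell
# nested_create_plan PARENT RELATIVE_PATH
# Prints one "zfs create" per path component: intermediates get
# canmount=off, only the last component is left mountable.
# Hypothetical helper.
nested_create_plan() (
    dataset="$1"
    rest="${2#/}"        # tolerate a leading slash
    rest="${rest%/}"     # and a trailing one
    while [ -n "${rest}" ]; do
        component="${rest%%/*}"
        dataset="${dataset}/${component}"
        case "${rest}" in
            */*) rest="${rest#*/}"
                 echo "zfs create -o canmount=off ${dataset}" ;;
            *)   rest=''
                 echo "zfs create ${dataset}" ;;
        esac
    done
)

# Usage (prints only):
#   nested_create_plan zpool/data /var/lib/docker
```

Review the printed commands before running them: zfs create fails for any intermediate dataset that already exists, so drop those lines first.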

Development

Conventional commits

This project uses Conventional Commits for its commit messages.

Commit types

Commit types besides fix and feat are:

build: Project structure, directory layout, build instructions for roll-out
refactor: Keeping functionality while streamlining or otherwise improving function flow
test: Working on test coverage
docs: Documentation for project or components

Commit scopes

The following scopes are known for this project. A Conventional Commits commit message may optionally use one of the following scopes or none:

iso: Changing Arch Linux ISO CD
zbm: Adjusting ZFSBootMenu's behavior
zfs: A change to how ZFS interacts with the system, either a pool or a dataset
os: Getting an operating system set up to correctly work in a ZFS boot environment
meta: Affects the project's repo layout, readme content, file names etc.
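
The types and scopes above lend themselves to a mechanical check. The following is a minimal sketch, not something this repo ships: it tests a commit subject against the listed types and scopes. The function name valid_commit_subject is hypothetical, and the optional ! before the colon is the Conventional Commits marker for breaking changes.

```shell
# valid_commit_subject SUBJECT
# Succeeds when SUBJECT looks like "type(scope): description" using the
# types and scopes listed above. The scope is optional.
# Hypothetical helper.
valid_commit_subject() (
    types='fix|feat|build|refactor|test|docs'
    scopes='iso|zbm|zfs|os|meta'
    printf '%s\n' "$1" \
        | grep -q -E "^(${types})(\((${scopes})\))?!?: .+"
)

# Usage:
#   valid_commit_subject 'docs(zfs): Warn about nested datasets' && echo ok
```

The same function could be called from a commit-msg hook with the first line of the message file if you want the check enforced locally.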

Credits

Most of what's here was shamelessly copied and slightly adapted for personal use from Jonathan Kirszling at GitHub.

Thanks to:

Jonathan Kirszling:
- github.com/eoli3n/arch-config/tree/master/scripts/zfs/install
- github.com/eoli3n/archiso-zfs
Maurizio Oliveri:
- github.com/Soulsuke/arch-zfs-tools
- gist.github.com/Soulsuke/6a7d1f09f7fef968a2f32e0ff32a5c4c
Zach Dykstra, Andrew J. Hesford and all other ZFSBootMenu contributors:
- Their ZFSBootMenu testing helper scripts (chroot-arch.sh, install-arch.sh)
github.com/kongkrit:
- gist.github.com/kongkrit/a0585e179e33c2adf92db4050ec5171d
