Arch Linux ZFS-on-Root: Implementation Guide

Deploying Arch Linux with ZFS as the root filesystem offers advanced data management features such as native encryption, atomic snapshots, and transparent compression. However, because ZFS is not part of the upstream Linux kernel due to licensing incompatibilities (CDDL vs. GPL), the implementation requires specific external modules and careful bootloader integration.

This guide details the procedural workflow for a ZFS-on-Root setup using the archzfs repository, booted through either systemd-boot or a unified kernel image (UKI).

1. Concepts and Architectural Constraints

The License Divergence

ZFS is maintained as an out-of-tree module (OpenZFS). In the Arch Linux ecosystem, this means the kernel and the ZFS module must be kept in sync: a prebuilt module only loads on the kernel version it was built for. If the kernel updates before a compatible ZFS module is available, the system may fail to import the pool and mount the root dataset on the next reboot.

Pool Layout Strategy

A standard ZFS-on-Root deployment typically involves:

  • EFI System Partition (ESP): A FAT32 partition for the bootloader, kernel, and initramfs. UEFI firmware cannot read ZFS, so these files cannot live inside the pool.
  • zpool: A storage pool spanning the rest of the disk, divided into datasets (e.g., for root, home, and logs).

2. Prerequisites

  • Live Environment with ZFS Support: The standard Arch Linux ISO does not include ZFS. Use a custom ISO that bundles it, or add the archzfs repository to the live environment's pacman configuration and install the ZFS kernel module and user-space tools there.
  • Hardware Compatibility: Ensure the target system supports UEFI. Legacy BIOS setups are significantly more complex for ZFS and are out of scope for this guide.
  • Data Backup: ZFS initialization involves full disk re-partitioning. Existing data will be destroyed.
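
As a rough sketch of the repository step, the helper below appends a repository stanza to a pacman.conf-style file and does nothing if the stanza already exists. The function name add_pacman_repo and the variables ARCHZFS_SERVER and ARCHZFS_KEY are placeholders introduced here; the actual server URL and signing key must be taken from the archzfs project's current documentation.

```shell
# Append a repository stanza to a pacman.conf-style file, unless it is already there.
# Arguments: config file, repository name, server URL.
add_pacman_repo() {
  conf=$1
  repo=$2
  server=$3
  # -x matches the whole line, -F treats the brackets literally.
  if grep -qxF "[$repo]" "$conf" 2>/dev/null; then
    return 0
  fi
  printf '\n[%s]\nServer = %s\n' "$repo" "$server" >> "$conf"
}

# Hypothetical usage in the live environment (as root); both variables are
# placeholders that must be filled in from the archzfs documentation:
#   add_pacman_repo /etc/pacman.conf archzfs "$ARCHZFS_SERVER"
#   pacman-key --recv-keys "$ARCHZFS_KEY" && pacman-key --lsign-key "$ARCHZFS_KEY"
#   pacman -Sy
```

Keeping the helper idempotent means it can be reused later against the installed system's configuration without duplicating the stanza.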

3. Implementation Workflow

Phase A: Partitioning and Pool Creation

The disk must be initialized with a GPT label. The first partition (usually 512 MiB to 1 GiB) is assigned the EFI System type and formatted as FAT32. The remaining space is designated for the ZFS pool.
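
One possible way to script this layout uses sgdisk from the gptfdisk package. The sketch below is a dry run by default: it only prints the commands unless APPLY=1 is set. The run and partition_disk helpers, the 1 GiB ESP size, and the BF00 type code for the ZFS partition are choices made for this example, not requirements.

```shell
# Dry-run helper: prints each command unless APPLY=1 is set in the environment.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# Create a GPT label with a 1 GiB ESP and a second partition on the remaining space.
# Argument: the whole-disk path under /dev/disk/by-id/.
partition_disk() {
  disk=$1
  run sgdisk --zap-all "$disk"               # wipes existing partition tables
  run sgdisk -n 1:0:+1G -t 1:EF00 "$disk"    # partition 1: EFI System
  run sgdisk -n 2:0:0 -t 2:BF00 "$disk"      # partition 2: rest of the disk, for the pool
  run mkfs.fat -F 32 "${disk}-part1"         # FAT32 filesystem on the ESP
}
```

Calling partition_disk with a by-id path prints the four commands for review; running it again with APPLY=1 executes them and destroys the disk's existing contents.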

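When creating the pool, use persistent device identifiers (found in /dev/disk/by-id/) rather than volatile nodes like /dev/sda. The following properties are commonly set at creation time. ashift is a pool-level property (passed with -o) that cannot be changed for a vdev after creation; atime and acltype are dataset properties (passed with -O) that child datasets inherit:

  • ashift=12: Aligns I/O to 4 KiB (2^12 byte) sectors.
  • atime=off: Avoids a metadata write on every file read.
  • acltype=posixacl: Enables POSIX ACLs, which Linux components such as systemd-journald use.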
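
The following sketch shows how these properties might be passed to zpool create. It is a dry run unless APPLY=1 is set. The helper names are invented for this example, and xattr=sa and compression=lz4 are optional additions that are not among the properties listed above; the pool name zpool follows the dataset paths used elsewhere in this guide.

```shell
# Dry-run helper: prints each command unless APPLY=1 is set in the environment.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# Create a single-device pool with an alternate root of /mnt.
# Arguments: pool name, partition path under /dev/disk/by-id/.
# -o sets pool properties, -O sets properties on the pool's root dataset.
# xattr=sa and compression=lz4 are optional extras assumed for this sketch.
create_root_pool() {
  pool=$1
  part=$2
  run zpool create -f \
    -o ashift=12 \
    -O atime=off \
    -O acltype=posixacl \
    -O xattr=sa \
    -O compression=lz4 \
    -O mountpoint=none \
    -R /mnt \
    "$pool" "$part"
}
```

For example, create_root_pool zpool /dev/disk/by-id/example-part2 prints the full command, where the device path is a placeholder for the real identifier.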

Phase B: Dataset Hierarchy

Create a hierarchical structure to separate the operating system from user data. This allows for snapshots of the system state without affecting the user's home directory.

  1. Create a root dataset container with mounting disabled.
  2. Create the active system dataset (e.g., zpool/ROOT/arch).
  3. Create secondary datasets for /home, /var/log, and /var/cache.

Set the mountpoint property to none on the pool's root dataset, assign explicit mountpoints to the child datasets, and use the canmount property to control which of them mount automatically during the installation phase.
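
A minimal sketch of such a hierarchy follows, assuming the pool was created with an alternate root of /mnt and mountpoint=none. It is a dry run unless APPLY=1 is set. The helper names, the canmount=noauto setting on the root dataset, and the intermediate var container are assumptions made for this example. The root dataset is mounted before the others are created so that they land inside it rather than underneath it.

```shell
# Dry-run helper: prints each command unless APPLY=1 is set in the environment.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# Create the dataset layout described above. Argument: pool name.
create_datasets() {
  pool=$1
  # Container dataset that is never mounted itself.
  run zfs create -o mountpoint=none "$pool/ROOT"
  # Active system dataset; noauto means it is only mounted on explicit request.
  run zfs create -o mountpoint=/ -o canmount=noauto "$pool/ROOT/arch"
  run zfs mount "$pool/ROOT/arch"
  # User data, kept outside ROOT so system rollbacks leave it alone.
  run zfs create -o mountpoint=/home "$pool/home"
  # Unmounted container whose children inherit /var/log and /var/cache.
  run zfs create -o mountpoint=/var -o canmount=off "$pool/var"
  run zfs create "$pool/var/log"
  run zfs create "$pool/var/cache"
}
```

Because /var itself is not a separate dataset here, the rest of /var stays inside the root dataset and is covered by its snapshots.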

Phase C: System Installation

Mount the datasets under /mnt. The EFI partition should be mounted at /mnt/boot or /mnt/efi. Use the pacstrap script to install the base system.

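Crucial Step: pacstrap resolves packages using the live environment's pacman configuration, so the archzfs repository must be enabled there before the kernel and ZFS packages are installed. It must also be present in the new system's /etc/pacman.conf so that later updates can find matching ZFS packages. Instead of the standard linux package, it is often safer to use linux-lts, which changes less often and is less likely to run ahead of OpenZFS module releases.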
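
The installation step could look roughly like the sketch below, which assumes the datasets are already mounted under /mnt and the archzfs repository and its signing key are set up in the live environment. It is a dry run unless APPLY=1 is set. The helper names and the package selection are assumptions; the -P option asks pacstrap to copy the live environment's pacman configuration into the new system.

```shell
# Dry-run helper: prints each command unless APPLY=1 is set in the environment.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# Mount the ESP and install a base system with the LTS kernel and matching ZFS module.
# Argument: the ESP partition path.
install_base() {
  esp=$1
  run mount --mkdir "$esp" /mnt/boot
  # -P copies the host pacman.conf (including the archzfs stanza) to the target.
  run pacstrap -P /mnt base linux-lts linux-firmware zfs-linux-lts
}
```

Mounting the ESP at /mnt/boot places the kernel and initramfs directly on the FAT32 partition; a setup that mounts it at /mnt/efi instead would need the paths adjusted accordingly.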

Phase D: Initial Ramdisk (initramfs) Configuration

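The mkinitcpio configuration requires the addition of the zfs hook to the HOOKS array in /etc/mkinitcpio.conf. This hook must be placed before the filesystems hook but after the keyboard hook. Placing it after keyboard allows a passphrase to be typed for encrypted datasets, and placing it before filesystems ensures the ZFS module is loaded and the pool is imported before the initramfs attempts to switch to the root filesystem. Regenerate the initramfs after changing the configuration.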
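
To make the ordering rule concrete, here is a small sketch that rewrites a HOOKS line and checks the result. Both function names are invented for this example, and the sample HOOKS line in the usage note is illustrative rather than the shipped default.

```shell
# Insert "zfs" immediately before "filesystems" in a mkinitcpio HOOKS line.
# The line is printed unchanged if zfs is already listed.
add_zfs_hook() {
  line=$1
  # Turn the parentheses into spaces so every hook name is space-delimited.
  case " $(printf '%s' "$line" | tr '()' '  ') " in
    *" zfs "*) printf '%s\n' "$line" ;;
    *) printf '%s\n' "$line" | sed 's/ filesystems/ zfs filesystems/' ;;
  esac
}

# Succeed only if keyboard comes before zfs and zfs comes before filesystems.
hooks_order_ok() {
  case "$1" in
    *keyboard*" zfs "*filesystems*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For instance, passing 'HOOKS=(base udev autodetect modconf keyboard block filesystems fsck)' to add_zfs_hook yields the same line with zfs inserted between block and filesystems. After editing /etc/mkinitcpio.conf, mkinitcpio -P rebuilds the images for all installed presets.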

Phase E: Bootloader Integration

The bootloader must pass the zfs=zpool/ROOT/arch parameter on the kernel command line, where the zfs initramfs hook reads it to locate the root dataset.

  • systemd-boot: Create a loader entry specifying the kernel, the initrd, and the ZFS boot parameters.
  • Unified Kernel Images (UKI): Embed the kernel parameters directly into the signed EFI binary for a more secure and streamlined boot process.
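
As an illustration of the systemd-boot option, the function below prints a loader entry for a given root dataset and kernel package. The function name, the entry title, the trailing rw option, and the file name in the usage note are assumptions; the kernel and initramfs paths assume the ESP is mounted at /boot.

```shell
# Print a systemd-boot loader entry for a ZFS root dataset.
# Arguments: root dataset, kernel package name (e.g. linux-lts).
loader_entry() {
  dataset=$1
  kernel=$2
  printf 'title   Arch Linux (ZFS)\n'
  printf 'linux   /vmlinuz-%s\n' "$kernel"
  printf 'initrd  /initramfs-%s.img\n' "$kernel"
  printf 'options zfs=%s rw\n' "$dataset"
}

# Hypothetical usage inside the installed system:
#   loader_entry zpool/ROOT/arch linux-lts > /boot/loader/entries/arch-zfs.conf
```

For a UKI built by mkinitcpio, the same options string would typically go into /etc/kernel/cmdline instead of a loader entry.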

4. Maintenance and Recovery

Kernel Update Hazards

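The prebuilt ZFS module packages in the archzfs repository are built against a specific kernel version, so the kernel must not be upgraded ahead of them. To prevent a broken system, hold the kernel back (for example with IgnorePkg in /etc/pacman.conf, or with a pacman hook that aborts the transaction) until the corresponding ZFS module is present in the archzfs repository. Alternatively, use the zfs-dkms package to rebuild the module locally for every kernel update, though this requires installed kernel headers, takes more processing time, and can still fail when a new kernel is not yet supported by the current OpenZFS release.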
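
A simple safeguard before rebooting after an upgrade is to check that every installed kernel has a ZFS module on disk. The sketch below assumes the Arch layout in which each installed kernel has a vmlinuz file under /usr/lib/modules/<version>/; the function name is invented for this example.

```shell
# List kernel versions under a modules directory that have no zfs module.
# Prints one version per line; prints nothing if every kernel is covered.
# Argument: modules directory (defaults to /usr/lib/modules).
kernels_missing_zfs() {
  moddir=${1:-/usr/lib/modules}
  for kdir in "$moddir"/*/; do
    # Skip leftovers that do not belong to an installed kernel image.
    [ -e "${kdir}vmlinuz" ] || continue
    # Matches zfs.ko as well as compressed variants such as zfs.ko.zst.
    if [ -z "$(find "$kdir" -name 'zfs.ko*' -print 2>/dev/null | head -n 1)" ]; then
      basename "$kdir"
    fi
  done
}
```

If the function prints a version, booting that kernel would leave the initramfs without a usable ZFS module, so it is worth resolving before the next reboot.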

Emergency Import

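If the system fails to boot, boot from a ZFS-compatible live ISO and import the pool using the zpool import -f -R /mnt <pool_name> command. The -f flag forces the import of a pool that was not cleanly exported, and -R sets an alternate root so that the datasets mount under /mnt for chrooting and repairs. Export the pool again before rebooting.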
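
A recovery session might be sketched as follows. It is a dry run unless APPLY=1 is set. The helper names are invented for this example, and the dataset path follows the zpool/ROOT/arch naming used in this guide. The sketch adds -N to the import so that nothing is mounted automatically, which lets the root dataset be mounted first; this is a variation on the command above, not a requirement.

```shell
# Dry-run helper: prints each command unless APPLY=1 is set in the environment.
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

# Import the pool under /mnt, enter a chroot, and clean up afterwards.
# Arguments: pool name, ESP partition path.
rescue_chroot() {
  pool=$1
  esp=$2
  run zpool import -f -N -R /mnt "$pool"   # -N: import without mounting anything
  run zfs load-key -a                      # only relevant for natively encrypted datasets
  run zfs mount "$pool/ROOT/arch"          # root first, so other datasets mount inside it
  run zfs mount -a
  run mount "$esp" /mnt/boot
  run arch-chroot /mnt                     # perform repairs, then exit the chroot
  run umount /mnt/boot
  run zpool export "$pool"                 # clean export so the next boot imports normally
}
```

Reviewing the printed commands first is useful here, since the pool name and ESP path on a rescue system may differ from what was used at install time.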

5. Limitations and Risks

  • Memory Usage: ZFS employs the Adaptive Replacement Cache (ARC), which on Linux has traditionally been allowed to grow to about 50% of system RAM by default. On systems with limited memory, cap it with the zfs_arc_max module parameter, set either in a modprobe configuration file or on the kernel command line.
  • Swap: Swap files on a ZFS dataset do not work reliably, and swap on a ZVOL can deadlock under memory pressure. Use a dedicated swap partition, or a ZVOL with specifically tuned properties, if swap is required.
  • Encryption Scope: ZFS native encryption is robust, but it operates at the dataset level and leaves some metadata, such as dataset names and properties, unencrypted. If encryption is required for the entire pool including metadata, placing the pool on top of LUKS is an alternative, albeit with a performance penalty.
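
For the memory cap, zfs_arc_max takes a value in bytes. The small sketch below converts a size in GiB into the corresponding modprobe option line; the function name and the target file in the usage note are assumptions.

```shell
# Print a modprobe.d option line capping the ARC at the given number of GiB.
# Argument: cap in whole GiB.
arc_max_option() {
  gib=$1
  printf 'options zfs zfs_arc_max=%s\n' "$((gib * 1024 * 1024 * 1024))"
}

# Hypothetical usage inside the installed system (as root):
#   arc_max_option 4 > /etc/modprobe.d/zfs.conf
#   mkinitcpio -P    # the module loads from the initramfs, so rebuild it
```

With an argument of 4, the function prints options zfs zfs_arc_max=4294967296. The same value can instead be passed on the kernel command line as zfs.zfs_arc_max=4294967296.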

Summary of Options

Component      Recommended      Alternative
Kernel         linux-lts        linux (Latest)
ZFS Package    zfs-linux-lts    zfs-dkms
Bootloader     systemd-boot     GRUB
Partitioning   GPT / EFI        Legacy MBR (Not recommended)