ZFS Basics for Homelab Storage
ZFS is a filesystem designed for data integrity at scale. Unlike ext4 or NTFS, which leave RAID and volume management to separate OS layers, ZFS integrates all three roles in one stack: filesystem, volume manager, and RAID. For homelab NAS use, ZFS is the standard choice: it checksums all data, detects silent corruption, repairs it when redundancy is available, and makes snapshots trivial.
Core Concepts
Pool (zpool): The top-level storage container. Consists of one or more VDEVs. A pool provides raw storage that datasets are carved from.
VDEV: A virtual device within a pool. The VDEV type determines redundancy:
- Single disk: No redundancy
- Mirror: Like RAID-1, two or more disks mirror each other
- RAIDZ1: Distributed parity (similar to RAID-5), can lose 1 disk
- RAIDZ2: Two-disk fault tolerance (similar to RAID-6), can lose 2 disks
- RAIDZ3: Three-disk fault tolerance, can lose 3 disks
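The capacity trade-off between these layouts is easy to estimate. A quick sketch (the `usable_tb` helper is illustrative, not a ZFS tool, and the math ignores metadata overhead, padding, and the usual advice to keep pools under ~80% full):

```shell
# Rough usable capacity for N same-size disks in one VDEV.
usable_tb() {
  vdev_type=$1; n=$2; size_tb=$3
  case $vdev_type in
    mirror) echo "$size_tb" ;;                # all disks hold copies of one disk's capacity
    raidz1) echo $(( (n - 1) * size_tb )) ;;  # 1 disk of parity
    raidz2) echo $(( (n - 2) * size_tb )) ;;  # 2 disks of parity
    raidz3) echo $(( (n - 3) * size_tb )) ;;  # 3 disks of parity
  esac
}

# Example: 4TB disks
echo "2-disk mirror: $(usable_tb mirror 2 4) TB usable"   # 4 TB
echo "4-disk raidz1: $(usable_tb raidz1 4 4) TB usable"   # 12 TB
echo "4-disk raidz2: $(usable_tb raidz2 4 4) TB usable"   # 8 TB
```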
Dataset: A filesystem within a pool. Inherits pool properties but can have its own settings (compression, quotas, record size). Think of datasets as directories with extra powers.
Snapshot: A point-in-time copy of a dataset. Instantaneous, space-efficient (only stores changes), and can be rolled back or cloned.
Scrub: A background integrity check that reads all data and verifies checksums. Run monthly to detect bit rot.
Installation
# Ubuntu/Debian
sudo apt install zfsutils-linux
# On Proxmox: ZFS is included in the kernel by default
Creating a Pool
# Mirror pool (2 disks, requires /dev/sdb and /dev/sdc)
sudo zpool create tank mirror /dev/sdb /dev/sdc
# RAIDZ2 with 4 disks
sudo zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde
# Mirror with separate L2ARC (SSD read cache) and a mirrored SLOG (dedicated ZIL device)
sudo zpool create tank mirror /dev/sdb /dev/sdc \
cache /dev/nvme0n1p1 \
log mirror /dev/nvme0n1p2 /dev/nvme1n1p1
# Check pool status
zpool status
zpool list
Best practice: Use disk IDs instead of /dev/sdX to avoid device letter changes after reboot:
ls -la /dev/disk/by-id/ | grep -v part
# Use names like /dev/disk/by-id/ata-ST4000DM004-...-0001
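Putting the two together, a by-id pool creation might look like this sketch (the serial numbers are made up; the command is echoed as a dry run, so remove the `echo` to actually create the pool):

```shell
# Hypothetical by-id paths -- substitute real entries from /dev/disk/by-id/
disk_a=/dev/disk/by-id/ata-ST4000DM004-EXAMPLE-SERIAL-A
disk_b=/dev/disk/by-id/ata-ST4000DM004-EXAMPLE-SERIAL-B

# Dry run: print the command instead of executing it
echo sudo zpool create tank mirror "$disk_a" "$disk_b"
```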
VDEV Selection Guide
| Array size | Recommendation | Reason |
|---|---|---|
| 2 disks | Mirror | Simple, fast, can lose 1 |
| 4-5 disks | RAIDZ2 | Lose 2 disks, good efficiency |
| 6-8 disks | RAIDZ2 or two mirrors | RAIDZ2 maximizes capacity; mirrors maximize IOPS |
| 9+ disks | RAIDZ3 or multiple VDEVs | Larger arrays benefit from more redundancy |
RAIDZ expansion is limited: historically a RAIDZ VDEV could not be grown at all; OpenZFS 2.3 added single-disk RAIDZ expansion, but it is slow and existing data keeps its old data-to-parity ratio until rewritten. Plan your layout upfront. You can always add another VDEV to the pool (pool expansion), but each VDEV's geometry is otherwise fixed.
For homelab use, mirrors are often preferable: faster rebuilds, can add one disk at a time (expand by converting to mirror), and better IOPS than RAIDZ.
Datasets
Create datasets to organize data and apply per-dataset settings:
# Create datasets
zfs create tank/media
zfs create tank/backups
zfs create tank/vms
# List datasets
zfs list
# Dataset properties
zfs get all tank/media | grep -E 'compress|record|quota'
Compression (almost always enable):
# Enable LZ4 compression (fast, good ratio)
zfs set compression=lz4 tank
# Or per-dataset
zfs set compression=zstd tank/backups # Better ratio, slightly slower
Record size tuning:
# For large sequential files (video, backups)
zfs set recordsize=1M tank/media
# For databases (match DB block size)
zfs set recordsize=8k tank/postgres # PostgreSQL default block is 8k
# For VMs (Proxmox recommends 16k for VM disks)
zfs set recordsize=16k tank/vms
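The guidance above can be collapsed into a small helper, shown here as a sketch (the workload names and the `recsize_for` function are illustrative, not part of ZFS):

```shell
# Map a workload type to a recordsize, following the guidance above.
recsize_for() {
  case $1 in
    media|backups) echo 1M ;;    # large sequential files
    postgres)      echo 8k ;;    # match the DB block size
    vms)           echo 16k ;;   # VM disk images
    *)             echo 128k ;;  # ZFS default
  esac
}

echo "zfs set recordsize=$(recsize_for vms) tank/vms"
```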
Quotas:
# Limit dataset to 2TB
zfs set quota=2T tank/media
# Reservation: guarantee minimum space
zfs set reservation=100G tank/critical-data
Snapshots
Snapshots are one of ZFS's best features. They're instantaneous, space-efficient, and allow point-in-time recovery:
# Create snapshot
zfs snapshot tank/data@2026-03-04
# List snapshots
zfs list -t snapshot
# Roll back to snapshot (destroys changes since snapshot)
zfs rollback tank/data@2026-03-04
# Access snapshot contents (read-only)
ls /tank/data/.zfs/snapshot/2026-03-04/
# Delete snapshot
zfs destroy tank/data@2026-03-04
# Clone snapshot to new dataset (writable copy)
zfs clone tank/data@2026-03-04 tank/data-clone
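A common convention is to date-stamp snapshot names so they sort chronologically. A minimal sketch (the `daily-` prefix is an arbitrary choice):

```shell
# Build a snapshot name like tank/data@daily-2026-03-04
dataset="tank/data"
snap="${dataset}@daily-$(date +%F)"
echo "$snap"   # pass this to: zfs snapshot "$snap"
```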
Automated snapshots: Use sanoid for automatic snapshot management with retention policies:
sudo apt install sanoid
# /etc/sanoid/sanoid.conf
[tank/data]
use_template = production
[template_production]
frequently = 0
hourly = 24
daily = 30
monthly = 3
yearly = 1
autosnap = yes
autoprune = yes
# Run sanoid (or via cron/systemd timer)
sudo sanoid --cron
Pool Maintenance
# Start a scrub (runs in background)
sudo zpool scrub tank
# Check scrub status
sudo zpool status tank
# Import pool (after reboot or drive change)
sudo zpool import tank
# Export pool (cleanly unmount before moving drives)
sudo zpool export tank
Schedule monthly scrubs via cron:
# /etc/cron.d/zfs-scrub
0 2 1 * * root /usr/sbin/zpool scrub tank
ARC and Memory
ZFS uses RAM for its Adaptive Replacement Cache (ARC). This is not a memory leak: the ARC shrinks under memory pressure and releases memory back to other processes. On systems with little RAM, though, you may want to cap it:
# Check current ARC stats
arc_summary # or: cat /proc/spl/kstat/zfs/arcstats | head -20
# Limit ARC to 4GB (value in bytes; writing to /etc/modprobe.d/ requires root)
echo "options zfs zfs_arc_max=4294967296" | sudo tee /etc/modprobe.d/zfs.conf
sudo update-initramfs -u
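Since `zfs_arc_max` takes a value in bytes, it helps to compute the number rather than hand-type it (a sketch; the 4 GiB figure matches the example above):

```shell
# Convert a GiB cap to the bytes value zfs_arc_max expects
gib=4
arc_max=$(( gib * 1024 * 1024 * 1024 ))
echo "options zfs zfs_arc_max=${arc_max}"   # options zfs zfs_arc_max=4294967296
```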
Send/Receive for Replication
ZFS send/receive replicates datasets, including snapshots, to another pool:
# Initial replication
zfs send tank/data@snapshot1 | ssh backup-server zfs receive backup/data
# Incremental replication (much faster — only sends changes)
zfs send -i tank/data@snapshot1 tank/data@snapshot2 | \
ssh backup-server zfs receive backup/data
Use syncoid (part of sanoid) to automate replication:
syncoid tank/data backup-server:backup/data
This is how homelab operators do 3-2-1 backups: primary ZFS pool → local backup ZFS pool → offsite ZFS or cloud.
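When scripting incrementals by hand, the tricky part is finding the most recent common snapshot to use as the `-i` source. With date-stamped names like those in the snapshot examples above, the newest sorts last. A sketch (`latest_snap` is illustrative; a real script would feed it the output of `zfs list -t snapshot -o name -s creation` instead of a hardcoded list):

```shell
# Pick the newest snapshot from a newline-separated, oldest-first list
latest_snap() {
  printf '%s\n' "$1" | tail -n 1
}

snaps="tank/data@daily-2026-03-01
tank/data@daily-2026-03-02
tank/data@daily-2026-03-03"

from=$(latest_snap "$snaps")
echo "zfs send -i $from tank/data@daily-2026-03-04 | ssh backup-server zfs receive backup/data"
```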
ZFS on Proxmox
Proxmox has first-class ZFS support. During installation, you can create a ZFS pool as the OS disk. For NAS storage:
- Proxmox UI → Datacenter → Storage → Add → ZFS
- Select pool name
- Choose dataset (optional)
- Now usable for VM disks, backups, ISO storage
For optimal VM performance on ZFS:
- Set `recordsize=16k` on the dataset containing VM disks
- Enable `atime=off` (ZFS default on Proxmox)
- Use `sync=disabled` cautiously (faster, but risks data loss on power failure)
ZFS is production-quality for homelab use. The main cost is RAM (ARC) — plan 1GB RAM per TB of storage as a guideline, though ZFS works with less.
