MergerFS + SnapRAID: Flexible Union Storage for Your Homelab
ZFS and Btrfs get all the attention in homelab storage discussions. They're great — if your drives are the same size, you buy them all at once, and you're comfortable with their specific management models. But many homelabbers accumulate drives over time. A 4TB from three years ago, an 8TB from last year, a 12TB you got on sale. ZFS makes mixing drive sizes painful. Btrfs RAID5/6 is still not recommended for production use.
MergerFS + SnapRAID is the answer for the rest of us. MergerFS pools your drives into a single mountpoint. SnapRAID adds parity protection on top. You can mix any drive sizes, add drives whenever you want, and each drive remains a standalone filesystem — if everything goes wrong, you can plug any individual drive into another machine and read its files directly.

This combination has been the backbone of media server homelabs for years, and for good reason. Let's set it up properly.
How MergerFS + SnapRAID Work Together
These are two separate tools that complement each other:
MergerFS: Union Filesystem
MergerFS takes multiple individual filesystems (drives) and presents them as a single merged mountpoint using FUSE. It's not RAID — it doesn't stripe or mirror data. Each file lives on exactly one physical drive. MergerFS just decides which drive to put new files on based on policies you configure.
Drive 1 (/mnt/disk1): movies/a-through-m/
Drive 2 (/mnt/disk2): movies/n-through-z/
Drive 3 (/mnt/disk3): tv-shows/
MergerFS mount (/mnt/storage):
movies/a-through-m/ (from disk1)
movies/n-through-z/ (from disk2)
tv-shows/ (from disk3)
Users and applications see a single /mnt/storage with all the files.
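Because every file lives whole on exactly one underlying drive, you can always check where something landed. A quick sanity test, using the mount points from the layout above:
# Write a file through the pool, then see which data drive actually holds it
touch /mnt/storage/placement-test.txt
find /mnt/disk1 /mnt/disk2 /mnt/disk3 -name placement-test.txt
rm /mnt/storage/placement-test.txt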
SnapRAID: Snapshot Parity
SnapRAID calculates parity data across your drives, like RAID5/6, but on a snapshot basis. You run snapraid sync periodically (typically nightly, via cron or a systemd timer), and it calculates parity for any files that changed. If a drive fails, you reconstruct its contents from the parity data and the surviving drives.
Key difference from traditional RAID: SnapRAID doesn't protect against data loss between syncs. If you write a file at 3 PM and the drive dies at 4 PM before the nightly sync runs, that file has no parity protection. This makes it ideal for media libraries (write once, read many) and less ideal for databases or constantly-changing data.
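If you want to see what currently sits in that unprotected window, SnapRAID will tell you; this assumes SnapRAID is already configured as described in Step 3 below:
# List files added, changed, or removed since the last sync; none of these
# are covered by parity until the next sync runs
snapraid diff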
Why This Combination?
| Feature | MergerFS + SnapRAID | ZFS RAIDZ | Btrfs RAID |
|---|---|---|---|
| Mix drive sizes | Yes | Possible but wasteful | Possible but risky |
| Add drives anytime | Yes, trivially | Yes, but must add vdevs | Yes |
| Individual drive recovery | Yes | No | No |
| Real-time parity | No (snapshot) | Yes | Yes |
| CPU overhead | Minimal | Medium (checksumming) | Low |
| RAM requirements | Minimal | 1GB per TB recommended | Low |
| Filesystem choice | Anything (ext4, XFS) | ZFS only | Btrfs only |
Prerequisites
You need:
- 2+ data drives of any size and type (HDD recommended for bulk storage)
- 1+ parity drive that is at least as large as your largest data drive
- A Linux system (any distro works)
- ext4 or XFS formatted drives (both work well; ext4 is the most common choice)
For this guide, we'll use:
- 3 data drives: 4TB, 8TB, 12TB
- 1 parity drive: 12TB (must be >= largest data drive)
Step 1: Prepare the Drives
Format each drive individually. Don't use LVM, don't use mdadm. Plain partitions with ext4 or XFS:
# Identify your drives
lsblk -o NAME,SIZE,MODEL
# Partition each drive (example for /dev/sdb)
sudo parted /dev/sdb mklabel gpt
sudo parted /dev/sdb mkpart primary ext4 0% 100%
# Format with ext4
sudo mkfs.ext4 -m 0 -T largefile4 /dev/sdb1
sudo mkfs.ext4 -m 0 -T largefile4 /dev/sdc1
sudo mkfs.ext4 -m 0 -T largefile4 /dev/sdd1
sudo mkfs.ext4 -m 0 -T largefile4 /dev/sde1 # parity drive
The -m 0 flag disables reserved blocks (unnecessary for data drives). -T largefile4 optimizes for large media files.
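If you already formatted a drive with the defaults, there's no need to redo it; tune2fs can clear the reservation in place (shown here for /dev/sdb1 as an example):
# Show the current reserved block count
sudo tune2fs -l /dev/sdb1 | grep -i "reserved block count"
# Set reserved blocks to zero on an existing ext4 filesystem
sudo tune2fs -m 0 /dev/sdb1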
Create Mount Points
sudo mkdir -p /mnt/disk{1,2,3}
sudo mkdir -p /mnt/parity1
sudo mkdir -p /mnt/storage # MergerFS mount point
Set Up fstab
Use UUIDs (not device names) so drives mount correctly regardless of detection order:
# Get UUIDs
blkid /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
Add to /etc/fstab:
# Data drives
UUID=xxxx-xxxx /mnt/disk1 ext4 defaults 0 2
UUID=yyyy-yyyy /mnt/disk2 ext4 defaults 0 2
UUID=zzzz-zzzz /mnt/disk3 ext4 defaults 0 2
# Parity drive
UUID=pppp-pppp /mnt/parity1 ext4 defaults 0 2
Mount everything:
sudo mount -a
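Before moving on, confirm each drive mounted where you expect (each line should show its own filesystem, not the root filesystem):
df -h /mnt/disk1 /mnt/disk2 /mnt/disk3 /mnt/parity1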
Step 2: Install MergerFS
From Package Manager
# Ubuntu/Debian
sudo apt install mergerfs
# Fedora
sudo dnf install mergerfs
# Arch (AUR)
yay -S mergerfs
From GitHub Releases (Recommended for Latest Version)
MergerFS releases frequently and distro packages can be outdated:
# Download the latest .deb or .rpm from:
# https://github.com/trapexit/mergerfs/releases
# Example for Debian/Ubuntu
wget https://github.com/trapexit/mergerfs/releases/download/2.40.2/mergerfs_2.40.2.debian-bookworm_amd64.deb
sudo dpkg -i mergerfs_2.40.2.debian-bookworm_amd64.deb
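To confirm which version actually ended up installed (and that the GitHub package took precedence over any older distro package), on Debian/Ubuntu you can ask dpkg:
dpkg -s mergerfs | grep -i '^version'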
Configure MergerFS
Add to /etc/fstab:
/mnt/disk1:/mnt/disk2:/mnt/disk3 /mnt/storage mergerfs defaults,allow_other,use_ino,cache.files=partial,dropcacheonclose=true,category.create=mfs,fsname=mergerfs 0 0
Let's break down the key options:
- allow_other: Lets all users access the mount (needed for Samba/NFS)
- use_ino: Improves inode handling for better compatibility
- cache.files=partial: Enables page caching for better read performance
- dropcacheonclose=true: Frees cache when files are closed (saves RAM)
- category.create=mfs: The placement policy for new files
File Placement Policies
The category.create option controls which drive new files land on. Common policies:
- mfs (most free space): New files go on the drive with the most free space. Good default — naturally balances your drives.
- lfs (least free space): Fills drives one at a time. Useful if you want to minimize spin-up of multiple drives.
- epmfs (existing path, most free space): If the parent directory exists on a drive, use that drive. Otherwise, pick the drive with the most free space. Best for keeping related files on the same drive.
- rand: Random placement. Simple but doesn't balance well.
For media homelabs, epmfs is often the best choice. It keeps TV show seasons together on the same drive, which improves sequential read performance and keeps things tidy.
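If you go with epmfs, the fstab entry is identical apart from the policy; here's the line from above with just that one option changed:
/mnt/disk1:/mnt/disk2:/mnt/disk3 /mnt/storage mergerfs defaults,allow_other,use_ino,cache.files=partial,dropcacheonclose=true,category.create=epmfs,fsname=mergerfs 0 0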
Mount it:
sudo mount /mnt/storage
Verify:
df -h /mnt/storage
# Should show combined space of all data drives
ls /mnt/storage
# Should show merged view of all drives
Step 3: Install and Configure SnapRAID
Installation
# Ubuntu/Debian
sudo apt install snapraid
# Fedora (may need COPR or build from source)
sudo dnf install snapraid
# From source
git clone https://github.com/amadvance/snapraid.git
cd snapraid
./configure
make
sudo make install
Configuration
Create /etc/snapraid.conf:
# Parity file(s) — one per parity drive
parity /mnt/parity1/snapraid.parity
# Content files — SnapRAID's database of file checksums
# Store on multiple drives for redundancy (at least 2 locations)
content /var/snapraid/snapraid.content
content /mnt/disk1/snapraid.content
content /mnt/disk2/snapraid.content
# Data drives — order matters! Don't change after initial sync
data d1 /mnt/disk1/
data d2 /mnt/disk2/
data d3 /mnt/disk3/
# Exclude files that change frequently or aren't worth protecting
exclude *.unrecognized
exclude /tmp/
exclude /lost+found/
exclude .Trash-*/
exclude *.part
exclude aquota.group
exclude aquota.user
Create the content directory:
sudo mkdir -p /var/snapraid
Initial Sync
The first sync calculates parity for all existing data. This takes a long time (hours for terabytes of data):
sudo snapraid sync
You can monitor progress — it shows files processed and estimated time remaining. Let it finish completely.
Verify Data Integrity
After the initial sync, run a scrub to verify checksums:
# Scrub the oldest 8% of data (default)
sudo snapraid scrub
# Or scrub everything (slow, but good for initial verification)
sudo snapraid scrub -p 100
Step 4: Automate with Cron/Systemd
The Sync-Scrub Runner Script
You don't want to run snapraid sync blindly. If a drive fails and all files appear "deleted," a naive sync would update parity to reflect the empty drive — destroying your ability to recover. Use a runner script with safety checks:
#!/bin/bash
# /usr/local/bin/snapraid-runner.sh
LOG="/var/log/snapraid-runner.log"
CONF="/etc/snapraid.conf"
DELETE_THRESHOLD=50 # Abort if more than this many deletes detected
echo "=== SnapRAID Runner - $(date) ===" >> "$LOG"
# Run a diff first to see what changed
DIFF_OUTPUT=$(snapraid diff 2>&1)
echo "$DIFF_OUTPUT" >> "$LOG"
# Count deletions
DELETED=$(echo "$DIFF_OUTPUT" | grep -c "^remove")
if [ "$DELETED" -gt "$DELETE_THRESHOLD" ]; then
echo "ERROR: $DELETED deletions detected, exceeds threshold of $DELETE_THRESHOLD. Aborting sync." >> "$LOG"
echo "Possible drive failure. Investigate before syncing." >> "$LOG"
# Send notification (adapt to your notification system)
# curl -s -d "SnapRAID: $DELETED deletions detected, sync aborted" ntfy.sh/your-topic
exit 1
fi
# Run sync
echo "Running sync..." >> "$LOG"
snapraid sync >> "$LOG" 2>&1
if [ $? -ne 0 ]; then
echo "ERROR: Sync failed" >> "$LOG"
exit 1
fi
# Run scrub (oldest 5% of data)
echo "Running scrub..." >> "$LOG"
snapraid scrub -p 5 -o 30 >> "$LOG" 2>&1
echo "=== Complete ===" >> "$LOG"
sudo chmod +x /usr/local/bin/snapraid-runner.sh
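Before scheduling it, run the script once by hand to make sure the diff, sync, and scrub stages all complete and the log looks sane:
sudo /usr/local/bin/snapraid-runner.sh
tail -n 50 /var/log/snapraid-runner.log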
Schedule It
# /etc/systemd/system/snapraid-runner.timer
[Unit]
Description=Daily SnapRAID sync and scrub
[Timer]
OnCalendar=*-*-* 03:00:00
RandomizedDelaySec=1800
Persistent=true
[Install]
WantedBy=timers.target
# /etc/systemd/system/snapraid-runner.service
[Unit]
Description=SnapRAID sync and scrub
After=mnt-disk1.mount mnt-disk2.mount mnt-disk3.mount mnt-parity1.mount
[Service]
Type=oneshot
ExecStart=/usr/local/bin/snapraid-runner.sh
Nice=19
IOSchedulingClass=idle
sudo systemctl enable --now snapraid-runner.timer
Nice=19 and IOSchedulingClass=idle keep SnapRAID from competing with your services for CPU time and disk I/O.
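You can verify the timer is armed and review past runs with systemd's own tooling:
# Next and previous run times for the timer
systemctl list-timers snapraid-runner.timer
# Output from earlier runs (in addition to /var/log/snapraid-runner.log)
journalctl -u snapraid-runner.service --since "24 hours ago"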
Drive Failure Recovery
Scenario: A Data Drive Dies
This is what SnapRAID is for. Say disk2 fails:
- Don't sync. Do not run snapraid sync. This would update parity to reflect the missing data.
- Replace the drive. Install a new drive, format it, mount it at /mnt/disk2.
- Fix the data:
sudo snapraid fix -d d2
This reconstructs all files that were on disk2 using parity data from the parity drive and data from the surviving drives.
- Verify the recovery:
sudo snapraid check -d d2
- Sync to update parity with the recovered data:
sudo snapraid sync
Scenario: The Parity Drive Dies
Replace it, mount it at /mnt/parity1, and re-sync:
sudo snapraid sync
SnapRAID recalculates all parity data. No data loss since parity isn't user data.
Scenario: Two Drives Die Simultaneously
With a single parity drive, you can survive one drive failure. If you need two-drive redundancy, add a second parity drive:
# /etc/snapraid.conf
parity /mnt/parity1/snapraid.parity
2-parity /mnt/parity2/snapraid.parity
This is equivalent to RAID6. Your parity drives must each be >= your largest data drive.
Adding More Drives
This is where MergerFS + SnapRAID shines compared to traditional RAID:
- Physically install the new drive
- Partition, format, and mount it (same process as Step 1):
sudo parted /dev/sdf mklabel gpt
sudo parted /dev/sdf mkpart primary ext4 0% 100%
sudo mkfs.ext4 -m 0 -T largefile4 /dev/sdf1
sudo mkdir /mnt/disk4
sudo mount /dev/sdf1 /mnt/disk4
- Add to fstab (with UUID)
- Add to MergerFS by updating the fstab line:
/mnt/disk1:/mnt/disk2:/mnt/disk3:/mnt/disk4 /mnt/storage mergerfs defaults,...
- Remount:
sudo umount /mnt/storage
sudo mount /mnt/storage
- Add to SnapRAID config:
data d4 /mnt/disk4/
- Sync:
sudo snapraid sync
That's it. No rebuild, no reshuffling, just a brief remount of the pool. The new drive is immediately available for new files and protected by parity after the next sync.

Performance Tuning
MergerFS Caching
For media streaming, the default settings are fine. For workloads with heavier metadata activity (lots of directory listings and stat calls), consider:
# In fstab mergerfs options
cache.files=auto-full # More aggressive caching
cache.readdir=true # Cache directory listings
cache.statfs=10 # Cache filesystem stats for 10 seconds (this option takes a timeout in seconds)
SSD Cache Drive
You can add an SSD as a "hot" drive that receives all new writes, with a script that periodically moves files to the slower HDDs:
# MergerFS with SSD tier
/mnt/ssd:/mnt/disk1:/mnt/disk2:/mnt/disk3 /mnt/storage mergerfs defaults,...,category.create=ff
The ff (first found) policy writes to the first drive listed — your SSD. Then a nightly script moves aged files to the HDDs:
#!/bin/bash
# Move files older than 1 day from the SSD to an HDD, preserving directory structure
cd /mnt/ssd || exit 1
find . -type f -mtime +1 -exec rsync -axqHAXWES --relative --remove-source-files {} /mnt/disk1/ \;
# Clean up directories left empty on the SSD
find /mnt/ssd -mindepth 1 -type d -empty -delete
This gives you SSD write performance for new content with HDD bulk storage for everything else.
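If you use this setup, schedule the mover to run shortly before the nightly SnapRAID sync so freshly moved files pick up parity the same night. A sketch using cron, assuming you saved the script as /usr/local/bin/ssd-mover.sh (a path chosen here for illustration):
# /etc/cron.d/ssd-mover: run at 02:00, an hour before the 03:00 SnapRAID timer
0 2 * * * root /usr/local/bin/ssd-mover.sh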
Monitoring
SMART Health
Monitor your drives with smartmontools:
sudo apt install smartmontools
# Quick health check
sudo smartctl -H /dev/sdb
sudo smartctl -H /dev/sdc
sudo smartctl -H /dev/sdd
# Detailed report
sudo smartctl -a /dev/sdb
Set up automated SMART monitoring:
# /etc/smartd.conf
/dev/sdb -a -o on -S on -s (S/../.././02|L/../../6/03) -m admin@example.com
/dev/sdc -a -o on -S on -s (S/../.././02|L/../../6/03) -m admin@example.com
/dev/sdd -a -o on -S on -s (S/../.././02|L/../../6/03) -m admin@example.com
/dev/sde -a -o on -S on -s (S/../.././02|L/../../6/03) -m admin@example.com
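smartd only acts on that configuration while the daemon is running. On Debian/Ubuntu the unit is usually called smartd (on some releases it's smartmontools), so adjust the name if needed:
sudo systemctl enable --now smartd
# Confirm it started and parsed /etc/smartd.conf without complaints
sudo systemctl status smartd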
SnapRAID Status
# Quick status
snapraid status
# Shows: array health, data/parity state, files protected, scrub status
Disk Space
# Per-drive usage
df -h /mnt/disk{1,2,3}
# Merged pool usage
df -h /mnt/storage
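If you'd like an early warning before the pool fills, a small cron-friendly check works well; the 90% threshold and the commented ntfy call are examples to adapt to your own alerting:
#!/bin/bash
# Warn when the merged pool crosses a usage threshold
THRESHOLD=90
USED=$(df --output=pcent /mnt/storage | tail -1 | tr -dc '0-9')
if [ "$USED" -ge "$THRESHOLD" ]; then
    echo "Storage pool at ${USED}% capacity"
    # curl -s -d "Storage pool at ${USED}% capacity" ntfy.sh/your-topic
fi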
When NOT to Use MergerFS + SnapRAID
Be honest about the limitations:
- Databases: Snapshot parity doesn't protect constantly-changing data. Use ZFS or proper backups for databases.
- VMs/containers: VM disk images change constantly. SnapRAID sync would be slow and protect poorly. Use ZFS or dedicated SSD storage.
- Real-time redundancy: If you can't tolerate any data loss window, you need real RAID or ZFS mirrors.
- Small setups: If you have 2 drives, just mirror them with mdadm. MergerFS + SnapRAID shines at 3+ data drives.
The sweet spot is media libraries, backups, document archives — data that's written once and read many times. That describes most homelab storage workloads.
Wrapping Up
MergerFS + SnapRAID is the most practical storage solution for homelabs that grow organically. You don't need matching drives. You don't need to plan capacity years in advance. You add drives as you find good deals, and the system absorbs them gracefully.
The trade-off — snapshot parity instead of real-time — is acceptable for the vast majority of homelab data. Your media library, your backups, your document archives: they're all write-once-read-many workloads where a nightly parity sync is perfectly adequate.
Set it up, automate the sync, monitor your SMART data, and you've got a flexible, expandable, recoverable storage pool that'll serve you for years.