VM Snapshots and Rollback: Safe Experimentation in Your Homelab
One of the best things about running VMs in a homelab is that you can break things without consequences. Want to try upgrading your kernel? Testing a new database version? Experimenting with a firewall rule that might lock you out? Take a snapshot first, and if things go sideways, roll back in seconds.
But snapshots aren't magic. They're not backups. They can eat your storage if you're not careful. And if you don't understand how they work under the hood, they can actually slow your VMs down significantly.
Let's fix that. This guide covers snapshot management on the two most common homelab hypervisors: Proxmox VE and libvirt/KVM. By the end, you'll have automated snapshot schedules, clean rollback procedures, and a solid understanding of when to use snapshots vs. full backups.

What Snapshots Actually Are (And Aren't)
A snapshot captures the state of a virtual machine at a specific point in time. Depending on the hypervisor, this includes:
- Disk state — The contents of the virtual disk(s)
- Memory state — The contents of RAM (optional)
- VM configuration — CPU, memory, network settings
Here's the critical distinction that trips people up:
| Feature | Snapshot | Backup |
|---|---|---|
| Speed to create | Seconds | Minutes to hours |
| Storage location | Same storage as VM | Separate storage |
| Survives storage failure | No | Yes (if stored elsewhere) |
| Performance impact | Yes (grows over time) | None after completion |
| Purpose | Short-term rollback | Long-term data protection |
| Dependency | Requires base disk | Self-contained |
Snapshots are not backups. I'll say it again because this is the number one mistake homelabbers make. A snapshot lives on the same storage as your VM. If that storage dies, you lose both the VM and all its snapshots. Snapshots are for short-term experimentation, not data protection.
How Snapshots Work Under the Hood
When you take a snapshot of a VM's disk, the hypervisor stops writing to the original disk image and creates a new "delta" file. All new writes go to the delta file. Reads check the delta first, then fall back to the original.
This is called copy-on-write (or redirect-on-write, depending on the implementation). The original disk becomes read-only, and the snapshot delta tracks only what's changed since the snapshot was taken.
This has some important implications:
- Snapshots are fast to create because nothing is copied — just a new delta file is started
- Snapshots grow over time because every write to the VM goes to the delta
- Snapshot chains slow things down because reads have to traverse the chain
- Deleting a snapshot isn't instant because the delta has to be merged back into the base (or parent)
Think of it like a stack of transparent overlays on a drawing. Each overlay has only the changes from the previous layer. To see the full picture, you need all the layers. The more layers you have, the longer it takes to find any given piece of data.
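You can watch this overlay mechanism in action with qemu-img, the disk image tool that QEMU-based hypervisors use under the hood. A throwaway sketch — the /tmp path, names, and sizes are all just examples:

```shell
# Build a tiny copy-on-write chain by hand (throwaway files in /tmp)
mkdir -p /tmp/cow-demo && cd /tmp/cow-demo

# The "original disk": a small empty qcow2 image
qemu-img create -f qcow2 base.qcow2 100M

# Taking a snapshot = starting a new delta (overlay) on top of it
qemu-img create -f qcow2 -b base.qcow2 -F qcow2 overlay1.qcow2

# A second snapshot stacks another overlay — now a chain of depth 2
qemu-img create -f qcow2 -b overlay1.qcow2 -F qcow2 overlay2.qcow2

# Each layer records its backing file; reads fall through the chain
qemu-img info --backing-chain overlay2.qcow2
```

Writes land only in overlay2.qcow2; base.qcow2 and overlay1.qcow2 are never touched again until you merge or delete the chain.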
Proxmox VE Snapshot Management
Proxmox is probably the most popular hypervisor for homelabs, and its snapshot support is solid. You can manage snapshots through the web UI or the command line.
Taking Snapshots via the Web UI
- Select your VM in the Proxmox web interface
- Go to the Snapshots tab
- Click Take Snapshot
- Give it a meaningful name (e.g., before-kernel-upgrade-2026-02-09)
- Optionally include VM memory state (RAM contents)
- Click Take Snapshot
Including the RAM state makes the snapshot larger and slower to create, but it lets you restore to a running state instead of a powered-off one. For most homelab scenarios, disk-only snapshots are fine.
Taking Snapshots via CLI
The command-line approach is what you'll want for automation:
# Take a snapshot (disk only)
qm snapshot 100 before-upgrade --description "Before kernel upgrade"
# Take a snapshot with RAM
qm snapshot 100 before-upgrade --vmstate --description "Before kernel upgrade"
# List all snapshots for VM 100
qm listsnapshot 100
# Roll back to a snapshot
qm rollback 100 before-upgrade
# Delete a snapshot (merge delta into base)
qm delsnapshot 100 before-upgrade
The VM ID (100 in these examples) is the numeric ID shown in the Proxmox interface.
Proxmox Snapshot Storage Considerations
Proxmox supports multiple storage backends, and not all of them support snapshots equally:
| Storage Type | Snapshot Support | Notes |
|---|---|---|
| ZFS | Excellent | Native ZFS snapshots, very fast, minimal overhead |
| LVM-thin | Good | Thin provisioning with snapshot support |
| Ceph/RBD | Excellent | Native snapshot support, distributed |
| Directory (qcow2) | Good | QEMU snapshots, works but slower |
| LVM (thick) | No | No snapshot support — use LVM-thin instead |
| NFS (qcow2) | Good | QEMU snapshots via qcow2 format |
| ZFS over iSCSI | Good | Native ZFS snapshots on the target |
If you're setting up a new Proxmox installation and want good snapshot support, ZFS or LVM-thin are your best bets. ZFS is particularly nice because snapshots are essentially free until the data diverges.
# Check your storage configuration
pvesm status
# Example output:
# Name Type Status Total Used Available %
# local dir active 98304 12288 86016 12.50%
# local-lvm lvmthin active 409600 204800 204800 50.00%
# zfspool zfspool active 1843200 368640 1474560 20.00%
ZFS Snapshots in Proxmox
If your Proxmox VMs are on ZFS storage, you get some extra benefits:
# List ZFS snapshots directly
zfs list -t snapshot -r rpool/data
# Example output:
# NAME USED AVAIL REFER MOUNTPOINT
# rpool/data/vm-100-disk-0@before-upgrade 12M - 32.5G -
# rpool/data/vm-100-disk-0@daily-2026-02-08 48M - 32.5G -
# rpool/data/vm-100-disk-0@daily-2026-02-09 8K - 32.5G -
# Check how much space snapshots are consuming
zfs list -o name,used,refer,usedbysnapshots -r rpool/data
# Send a ZFS snapshot to another pool (great for backups!)
zfs send rpool/data/vm-100-disk-0@before-upgrade | zfs recv backup/vm-100
ZFS snapshots are instantaneous and initially take zero additional space. They only grow as the active data diverges from the snapshot. This makes ZFS the ideal storage backend for a snapshot-heavy workflow.
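Once a full copy exists on the backup pool, later syncs don't need to resend everything: `zfs send -i` transmits only the blocks that changed between two snapshots. A sketch continuing with the dataset names from the examples above (it assumes the earlier full send already populated backup/vm-100):

```shell
# Take today's snapshot
zfs snapshot rpool/data/vm-100-disk-0@daily-2026-02-10

# Incremental send: only the delta between yesterday and today crosses the wire
zfs send -i @daily-2026-02-09 rpool/data/vm-100-disk-0@daily-2026-02-10 \
  | zfs recv backup/vm-100
```

Pipe the send through ssh and the receiving pool can live on another machine entirely — that's a real backup, not just a snapshot.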
Libvirt/KVM Snapshot Management
If you're running KVM with libvirt (either directly on a Linux host or through something like virt-manager), you manage snapshots with the virsh command.
Internal vs. External Snapshots
Libvirt supports two types of snapshots, and this is important to understand:
Internal snapshots are stored inside the qcow2 disk image. They're easy to manage but only work with qcow2 format disks and have some performance overhead.
External snapshots create a new overlay file, leaving the original as a read-only backing file. They're faster and more flexible but slightly more complex to manage.
# Take an internal snapshot (disk + memory)
virsh snapshot-create-as myvm snap-before-upgrade \
--description "Before kernel upgrade" \
--atomic
# Take a disk-only snapshot (VM can be running or stopped — note that
# with --disk-only, libvirt defaults to *external* overlay files)
virsh snapshot-create-as myvm snap-before-upgrade \
--description "Before kernel upgrade" \
--disk-only \
--atomic
# Take an external snapshot
virsh snapshot-create-as myvm snap-before-upgrade \
--description "Before kernel upgrade" \
--disk-only \
--diskspec vda,snapshot=external,file=/var/lib/libvirt/images/myvm-snap.qcow2 \
--atomic
# List all snapshots
virsh snapshot-list myvm
# Get snapshot details
virsh snapshot-info myvm snap-before-upgrade
# Revert to a snapshot
virsh snapshot-revert myvm snap-before-upgrade
# Delete a snapshot
virsh snapshot-delete myvm snap-before-upgrade
Managing External Snapshot Chains
External snapshots can form chains that look like this:
base.qcow2 ← snap1.qcow2 ← snap2.qcow2 ← snap3.qcow2 (active)
Each file in the chain depends on everything before it. You can inspect the chain with:
# Show the backing chain for a disk image
qemu-img info --backing-chain /var/lib/libvirt/images/myvm-snap3.qcow2
# Example output:
# image: /var/lib/libvirt/images/myvm-snap3.qcow2
# file format: qcow2
# virtual size: 50 GiB (53687091200 bytes)
# disk size: 256 MiB
# backing file: /var/lib/libvirt/images/myvm-snap2.qcow2
# backing file format: qcow2
#
# image: /var/lib/libvirt/images/myvm-snap2.qcow2
# ...
To merge snapshots and reduce the chain (called "block commit"):
# Merge the active layer into its backing file (commit the top layer down)
virsh blockcommit myvm vda --active --pivot --verbose
# Or merge a specific snapshot into its parent
virsh blockcommit myvm vda --top /path/to/snap2.qcow2 --base /path/to/snap1.qcow2 --verbose
Practical KVM Snapshot Script
Here's a script I use for taking quick snapshots before doing anything risky:
#!/bin/bash
# vm-snap.sh — Quick snapshot management for libvirt VMs
set -euo pipefail
VM_NAME="${1:?Usage: vm-snap.sh <vm-name> [take|list|revert|delete] [snap-name]}"
ACTION="${2:-list}"
SNAP_NAME="${3:-manual-$(date +%Y%m%d-%H%M%S)}"
case "$ACTION" in
take)
echo "Taking snapshot '$SNAP_NAME' of VM '$VM_NAME'..."
virsh snapshot-create-as "$VM_NAME" "$SNAP_NAME" \
--description "Manual snapshot $(date)" \
--atomic
echo "Snapshot created. Current snapshots:"
virsh snapshot-list "$VM_NAME"
;;
list)
virsh snapshot-list "$VM_NAME" --tree
;;
revert)
echo "Reverting VM '$VM_NAME' to snapshot '$SNAP_NAME'..."
virsh snapshot-revert "$VM_NAME" "$SNAP_NAME"
echo "Reverted successfully."
;;
delete)
echo "Deleting snapshot '$SNAP_NAME' from VM '$VM_NAME'..."
virsh snapshot-delete "$VM_NAME" "$SNAP_NAME"
echo "Deleted."
;;
*)
echo "Unknown action: $ACTION"
echo "Usage: vm-snap.sh <vm-name> [take|list|revert|delete] [snap-name]"
exit 1
;;
esac
Usage:
chmod +x vm-snap.sh
# Take a snapshot
./vm-snap.sh myvm take before-nginx-upgrade
# List snapshots
./vm-snap.sh myvm list
# Roll back
./vm-snap.sh myvm revert before-nginx-upgrade
# Clean up
./vm-snap.sh myvm delete before-nginx-upgrade
Snapshot Chains and Performance Impact
This is where most people get bitten. A single snapshot has minimal performance impact. But snapshot chains — multiple snapshots stacked on top of each other — can significantly degrade VM performance.
Why Chains Hurt Performance
Every read operation has to traverse the chain from the active layer back to the base image to find the data. With a chain of 5 snapshots, a read might need to check 6 files before finding the data.
Write operations always go to the active layer, so they aren't affected as much. But the metadata overhead still adds up.
Here's a rough guide to how chain depth affects I/O performance:
| Chain Depth | Read Impact | Write Impact | Recommendation |
|---|---|---|---|
| 1 (just one snapshot) | Minimal (< 5%) | Minimal | Fine for days/weeks |
| 2-3 snapshots | Noticeable (5-15%) | Minimal | OK for short-term testing |
| 4-5 snapshots | Significant (15-30%) | Moderate | Consolidate soon |
| 6+ snapshots | Severe (30%+) | Moderate | Consolidate immediately |
These numbers are approximate and depend heavily on your storage backend, I/O patterns, and whether the data is cached. But the trend is clear: keep your chains short.
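Rather than trusting the table, measure on your own hardware: run the same random-read test before and after stacking snapshots and compare the reported IOPS. A sketch with fio (assumes fio is installed in the guest; file path and sizes are arbitrary):

```shell
# Random 4K reads with the page cache bypassed (--direct=1), so the
# numbers reflect the storage path rather than RAM
fio --name=chain-depth-test --rw=randread --bs=4k --size=64M \
    --runtime=10 --time_based --direct=1 \
    --filename=/tmp/fio-chain-test

# Clean up the test file afterwards
rm -f /tmp/fio-chain-test
```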
Monitoring Snapshot Size
Snapshots grow as the VM writes new data. A snapshot that was 0 bytes when created can balloon to gigabytes if the VM is doing heavy I/O.
# Proxmox — check snapshot sizes
qm listsnapshot 100
# libvirt — check snapshot disk usage
virsh domblkinfo myvm vda
qemu-img info /var/lib/libvirt/images/myvm-snap1.qcow2
# ZFS — check snapshot space usage
zfs list -t snapshot -o name,used,refer -r rpool/data
Set up a simple monitoring script to alert you when snapshot storage gets out of hand:
#!/bin/bash
# check-snapshot-growth.sh — Alert if snapshots are using too much space
set -euo pipefail
MAX_SNAPSHOT_GB=50
STORAGE_PATH="/var/lib/libvirt/images"
# Calculate total snapshot overlay size
SNAP_SIZE_BYTES=$(find "$STORAGE_PATH" -name "*-snap*" -type f -printf "%s\n" | paste -sd+ - | bc)
SNAP_SIZE_GB=$(( ${SNAP_SIZE_BYTES:-0} / 1073741824 ))  # default to 0 if no overlays found
if [ "$SNAP_SIZE_GB" -gt "$MAX_SNAPSHOT_GB" ]; then
echo "WARNING: Snapshot storage is using ${SNAP_SIZE_GB}GB (threshold: ${MAX_SNAPSHOT_GB}GB)"
echo "Consider consolidating or deleting old snapshots."
# Optionally send a notification
# curl -s -o /dev/null "https://ntfy.sh/my-homelab-alerts" \
# -d "Snapshot storage: ${SNAP_SIZE_GB}GB exceeds ${MAX_SNAPSHOT_GB}GB threshold"
fi
Automated Snapshot Schedules
Manual snapshots are great for one-off experiments, but you should also have automated snapshots for your important VMs. These act as quick rollback points for unexpected problems.
Proxmox: Scheduled Snapshots with Cron
Proxmox doesn't have built-in snapshot scheduling (its schedule feature is for full backups), but you can easily add it with cron:
# /etc/cron.d/vm-snapshots
# Automated daily snapshots for important VMs
# Runs at 3:00 AM, keeps last 3 daily snapshots
0 3 * * * root /usr/local/bin/proxmox-auto-snapshot.sh
#!/bin/bash
# /usr/local/bin/proxmox-auto-snapshot.sh
# Automated snapshot management for Proxmox VMs
set -euo pipefail
# VMs to snapshot (space-separated VM IDs)
VMS="100 101 102"
# How many daily snapshots to keep
KEEP_DAYS=3
DATE=$(date +%Y%m%d)
SNAP_PREFIX="auto-daily"
for VMID in $VMS; do
SNAP_NAME="${SNAP_PREFIX}-${DATE}"
# Check if VM exists and is not a template
if ! qm status "$VMID" &>/dev/null; then
echo "VM $VMID does not exist, skipping"
continue
fi
echo "Creating snapshot '$SNAP_NAME' for VM $VMID..."
qm snapshot "$VMID" "$SNAP_NAME" --description "Automated daily snapshot"
# Clean up old snapshots
echo "Cleaning up old snapshots for VM $VMID..."
# Feed the loop via process substitution so a prefix with no matches
# doesn't trip `set -o pipefail` (grep exits non-zero on no match)
while read -r line; do
OLD_SNAP=$(echo "$line" | awk '{print $2}')
# Extract date from snapshot name
OLD_DATE=$(echo "$OLD_SNAP" | grep -oP '\d{8}$' || true)
if [ -n "$OLD_DATE" ]; then
AGE_DAYS=$(( ($(date +%s) - $(date -d "$OLD_DATE" +%s)) / 86400 ))
if [ "$AGE_DAYS" -gt "$KEEP_DAYS" ]; then
echo " Deleting old snapshot: $OLD_SNAP (${AGE_DAYS} days old)"
qm delsnapshot "$VMID" "$OLD_SNAP" || true
fi
fi
done < <(qm listsnapshot "$VMID" | grep "$SNAP_PREFIX" || true)
echo "Done with VM $VMID"
done
echo "Snapshot rotation complete."
Libvirt/KVM: Automated Snapshots with Systemd Timers
Systemd timers are more reliable than cron and give you better logging:
# /etc/systemd/system/vm-snapshot.timer
[Unit]
Description=Daily VM snapshot timer
[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true
RandomizedDelaySec=300
[Install]
WantedBy=timers.target
# /etc/systemd/system/vm-snapshot.service
[Unit]
Description=Take and rotate VM snapshots
After=libvirtd.service
[Service]
Type=oneshot
ExecStart=/usr/local/bin/kvm-auto-snapshot.sh
StandardOutput=journal
StandardError=journal
#!/bin/bash
# /usr/local/bin/kvm-auto-snapshot.sh
# Automated snapshot management for libvirt/KVM VMs
set -euo pipefail
# VMs to snapshot
VMS="webserver database mediaserver"
KEEP_DAYS=3
DATE=$(date +%Y%m%d)
SNAP_PREFIX="auto-daily"
for VM in $VMS; do
SNAP_NAME="${SNAP_PREFIX}-${DATE}"
# Check if VM exists
if ! virsh dominfo "$VM" &>/dev/null; then
echo "VM $VM does not exist, skipping"
continue
fi
echo "Creating snapshot '$SNAP_NAME' for VM '$VM'..."
virsh snapshot-create-as "$VM" "$SNAP_NAME" \
--description "Automated daily snapshot $(date)" \
--atomic
# Clean up old snapshots
# Process substitution avoids tripping pipefail when nothing matches the prefix
while read -r OLD_SNAP; do
[ -z "$OLD_SNAP" ] && continue
OLD_DATE=$(echo "$OLD_SNAP" | grep -oP '\d{8}$' || true)
if [ -n "$OLD_DATE" ]; then
AGE_DAYS=$(( ($(date +%s) - $(date -d "$OLD_DATE" +%s)) / 86400 ))
if [ "$AGE_DAYS" -gt "$KEEP_DAYS" ]; then
echo " Deleting old snapshot: $OLD_SNAP (${AGE_DAYS} days old)"
virsh snapshot-delete "$VM" "$OLD_SNAP" || true
fi
fi
done < <(virsh snapshot-list "$VM" --name | grep "^${SNAP_PREFIX}" || true)
done
echo "Snapshot rotation complete at $(date)"
Enable the timer:
sudo systemctl daemon-reload
sudo systemctl enable --now vm-snapshot.timer
# Verify it's scheduled
systemctl list-timers vm-snapshot.timer
# Check logs after it runs
journalctl -u vm-snapshot.service --since today
Storage Considerations for Snapshot-Heavy Workflows
If you're using snapshots regularly (and you should be), you need to plan your storage accordingly.
Thin Provisioning Is Your Friend
Thin provisioning means the storage backend only allocates space as data is actually written, rather than reserving the full virtual disk size upfront. This is essential for snapshot workflows because snapshots only consume space proportional to the data that's changed.
# Proxmox — check if your storage uses thin provisioning
pvesm status
# Create a thinly provisioned LVM storage
lvcreate -L 500G -T pve/data # Creates a 500GB thin pool
# ZFS is inherently thin-provisioned
zfs create -V 50G rpool/data/vm-100-disk-0
# Only allocates space as data is written
Estimating Snapshot Storage Needs
A rough formula for snapshot storage:
Snapshot storage = Change rate × Retention period × Number of VMs
For example:
- 5 VMs each writing 2GB/day of new data
- Keeping 3 days of snapshots
- Total snapshot overhead: 5 × 2GB × 3 = 30GB
In practice, this varies wildly. Database servers change a lot of data. Web servers serving static content change very little. Monitor your actual usage for a week before setting up retention policies.
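The formula is trivial to script. A minimal sketch using the example numbers above — and, if you're on ZFS, the `written` property tells you the real change rate to plug in instead of guessing:

```shell
# Example inputs — replace with your measured values
NUM_VMS=5
CHANGE_GB_PER_DAY=2
RETENTION_DAYS=3

SNAPSHOT_GB=$((NUM_VMS * CHANGE_GB_PER_DAY * RETENTION_DAYS))
echo "Estimated snapshot overhead: ${SNAPSHOT_GB}GB"   # → Estimated snapshot overhead: 30GB

# On ZFS, measure the actual data written since the last snapshot:
#   zfs get written rpool/data/vm-100-disk-0
```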
Snapshot Storage Best Practices
| Practice | Why |
|---|---|
| Keep snapshots on the same pool as the VM | Required (snapshots are deltas of the base) |
| Monitor pool free space | Snapshots can fill your pool if you're not watching |
| Set up alerts at 80% pool usage | Leave headroom for snapshot growth |
| Don't keep snapshots for more than a few days | Performance degrades and storage accumulates |
| Delete snapshots before taking new ones if space is tight | Deletion merges the delta, freeing space |
| Use ZFS or LVM-thin, not thick provisioning | Thick provisioning wastes space with snapshots |
When to Use Snapshots vs. Full Backups
This is the decision matrix I use:
Use Snapshots When:
- Upgrading software — Snap before, upgrade, test, delete the snap if it works, or roll back if it doesn't
- Testing configuration changes — Especially firewall rules, DNS changes, or anything that might lock you out
- Before running database migrations — Quick rollback if the migration fails
- Quick development iteration — Snap a known-good state, experiment, roll back, repeat
- Before applying OS updates — Kernel updates, library updates, etc.
Use Full Backups When:
- Long-term data protection — Backups survive storage failure, snapshots don't
- Compliance or audit requirements — Even in a homelab, you might want monthly snapshots of certain configs
- Before major infrastructure changes — Moving to new storage, new hypervisor, etc.
- Migration between hosts — Backups are portable, snapshots usually aren't
- Anything you can't afford to lose — If the answer is "I'd be really upset if I lost this," it needs a backup, not just a snapshot
The Ideal Workflow
The best approach combines both:
- Automated daily snapshots for quick rollback (keep 3-7 days)
- Automated weekly full backups to a separate NAS or storage (keep 4-8 weeks)
- Automated monthly backups to offsite/cloud storage (keep 6-12 months)
- Manual snapshots before any risky change (delete after verifying the change)
# Example backup schedule combining snapshots and backups
# /etc/cron.d/vm-protection
# Daily snapshots at 3 AM (keep 3 days)
0 3 * * * root /usr/local/bin/proxmox-auto-snapshot.sh
# Weekly full backups on Sunday at 4 AM (keep 4 weeks)
0 4 * * 0 root /usr/local/bin/proxmox-backup.sh weekly
# Monthly full backups on the 1st at 5 AM (keep 6 months)
0 5 1 * * root /usr/local/bin/proxmox-backup.sh monthly
Snapshot Cleanup and Consolidation
Forgetting to clean up snapshots is the most common mistake. Here's how to stay on top of it.
Manual Cleanup Procedure
# Proxmox — list all snapshots across all VMs
for VMID in $(qm list | awk 'NR>1 {print $1}'); do
echo "=== VM $VMID ==="
qm listsnapshot "$VMID"
done
# Delete a specific snapshot
qm delsnapshot 100 old-snapshot-name
# libvirt — list all snapshots across all VMs
for VM in $(virsh list --all --name); do
# grep -c . counts non-empty lines (virsh emits a trailing blank line
# that would make wc -l overcount)
COUNT=$(virsh snapshot-list "$VM" --name 2>/dev/null | grep -c . || true)
if [ "$COUNT" -gt 0 ]; then
echo "=== $VM ($COUNT snapshots) ==="
virsh snapshot-list "$VM"
fi
done
Automated Cleanup Script
This script finds and removes snapshots older than a specified age:
#!/bin/bash
# snapshot-cleanup.sh — Find and remove old snapshots across all VMs
set -euo pipefail
MAX_AGE_DAYS="${1:-7}"
DRY_RUN="${2:-false}"
echo "Cleaning up snapshots older than $MAX_AGE_DAYS days"
[ "$DRY_RUN" = "true" ] && echo "(DRY RUN — no changes will be made)"
TOTAL_DELETED=0
for VMID in $(qm list 2>/dev/null | awk 'NR>1 {print $1}'); do
# Feed the loop via process substitution, not a pipeline — a piped
# `while` runs in a subshell, so TOTAL_DELETED updates would be lost
while read -r line; do
SNAP_NAME=$(echo "$line" | awk '{print $2}')
[ -z "$SNAP_NAME" ] && continue
# Try to extract date from snapshot name
SNAP_DATE=$(echo "$SNAP_NAME" | grep -oP '\d{4}-?\d{2}-?\d{2}' | head -1 || true)
if [ -z "$SNAP_DATE" ]; then
echo " Skipping $SNAP_NAME (no date in name)"
continue
fi
# Normalize date format
SNAP_DATE=$(echo "$SNAP_DATE" | sed 's/\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)/\1-\2-\3/')
AGE_DAYS=$(( ($(date +%s) - $(date -d "$SNAP_DATE" +%s)) / 86400 ))
if [ "$AGE_DAYS" -gt "$MAX_AGE_DAYS" ]; then
if [ "$DRY_RUN" = "true" ]; then
echo " Would delete: VM $VMID / $SNAP_NAME (${AGE_DAYS} days old)"
else
echo " Deleting: VM $VMID / $SNAP_NAME (${AGE_DAYS} days old)"
qm delsnapshot "$VMID" "$SNAP_NAME" || echo " Failed to delete $SNAP_NAME"
TOTAL_DELETED=$((TOTAL_DELETED + 1))
fi
fi
done < <(qm listsnapshot "$VMID" 2>/dev/null | grep -v "current" || true)
done
echo "Cleanup complete. Deleted $TOTAL_DELETED snapshots."
Usage:
# Dry run — see what would be deleted
./snapshot-cleanup.sh 7 true
# Actually delete snapshots older than 7 days
./snapshot-cleanup.sh 7
# Delete snapshots older than 3 days
./snapshot-cleanup.sh 3
Practical Snapshot Workflow Examples
Let's walk through some real scenarios where snapshots save the day.
Scenario 1: Kernel Upgrade
# 1. Take a snapshot
qm snapshot 100 before-kernel --description "Before kernel 6.x upgrade"
# 2. SSH into the VM and do the upgrade
ssh root@vm100 "apt update && apt upgrade -y && apt install -y linux-image-amd64"
# 3. Reboot and test
ssh root@vm100 "reboot"
# Wait for it to come back...
ssh root@vm100 "uname -r" # Verify new kernel
ssh root@vm100 "systemctl --failed" # Check for broken services
# 4a. Everything works — delete the snapshot
qm delsnapshot 100 before-kernel
# 4b. Something's broken — roll back
qm rollback 100 before-kernel
# VM is back to exactly where it was before the upgrade
Scenario 2: Database Migration
# 1. Take a snapshot (include memory state for a consistent DB snapshot)
virsh snapshot-create-as dbserver before-migration \
--description "Before schema migration v42" \
--atomic
# 2. Run the migration
ssh root@dbserver "cd /opt/myapp && python manage.py migrate"
# 3. Run verification queries
ssh root@dbserver "psql -U myapp -c 'SELECT count(*) FROM users'"
ssh root@dbserver "psql -U myapp -c 'SELECT count(*) FROM orders'"
# 4a. Migration successful — delete snapshot
virsh snapshot-delete dbserver before-migration
# 4b. Migration failed — roll back
virsh snapshot-revert dbserver before-migration
# Database is back to pre-migration state
Scenario 3: Firewall Rule Testing
# 1. Take a snapshot (this is critical for firewall changes!)
qm snapshot 100 before-firewall --description "Before iptables changes"
# 2. Apply firewall rules with a safety net
ssh root@vm100 'bash -s' <<'EOF'
# Schedule a rule flush in 5 minutes in case we lock ourselves out
echo "iptables -F && iptables -P INPUT ACCEPT" | at now + 5 minutes
# Apply the new rules
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
iptables -A INPUT -j DROP
EOF
# 3. Test connectivity
curl -s -o /dev/null -w "%{http_code}" http://vm100 # Should be 200
# 4a. Rules work — cancel the pending safety flush and persist
ssh root@vm100 'for j in $(atq | cut -f1); do atrm "$j"; done; iptables-save > /etc/iptables/rules.v4'
qm delsnapshot 100 before-firewall
qm delsnapshot 100 before-firewall
# 4b. Locked out — either wait 5 minutes for the flush, or:
qm rollback 100 before-firewall
Tips and Gotchas
Here are the hard-won lessons from years of snapshot use:
1. Never leave manual snapshots around for more than a day or two. You'll forget about them, they'll grow, and eventually you'll wonder why your storage pool is full.
2. Name your snapshots descriptively. snap1 tells you nothing. before-postgresql-16-upgrade-2026-02-09 tells you everything.
3. Don't snapshot VMs with heavy I/O unless you need to. Database servers, for example, generate huge snapshot deltas. Consider stopping the VM briefly for a consistent snapshot, or use application-level backups (like pg_dump) instead.
4. Be careful with snapshot + live migration. Some hypervisors don't support live-migrating a VM that has snapshots. Check your hypervisor's documentation.
5. Monitor your snapshot chain depth. If you see more than 3-4 snapshots in a chain, consolidate. The performance impact is real.
6. Test your rollback procedure before you need it. Don't wait until you're panicking to learn how rollback works. Take a snapshot, make a small change, roll back, verify. Do this in a low-stakes environment first.
7. Document your snapshot policies. Write down which VMs get automated snapshots, how many you keep, and when manual snapshots should be taken. Future you will thank present you.
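For the application-level alternative mentioned in tip 3, a consistent logical dump often beats a disk snapshot for databases. A sketch assuming PostgreSQL; the database name and backup path are just examples:

```shell
# Consistent logical backup, independent of the VM's disk layer.
# -F c = custom format, which supports selective/parallel restore
pg_dump -U myapp -d myapp_production -F c \
    -f /backup/myapp-$(date +%Y%m%d).dump
```

Unlike a disk snapshot, this dump is portable: you can restore it on different storage, a different VM, or even a different PostgreSQL major version via pg_restore.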
Wrapping Up
Snapshots are one of the most powerful tools in your homelab toolkit. They turn risky operations into reversible experiments. But they need to be understood and managed — they're not "set and forget."
The key takeaways:
- Snapshots are for short-term rollback, not long-term protection — pair them with real backups
- Keep chains short — more than 3-4 snapshots in a chain will noticeably impact performance
- Automate both creation and cleanup — use the cron jobs and scripts in this guide
- Always take a snapshot before risky changes — upgrades, migrations, firewall changes, anything
- Name them well and delete them promptly — storage is finite, and forgotten snapshots are a ticking time bomb
Now go break something in your homelab. You've got a snapshot to fall back on.