Ceph Storage Cluster: Complete Setup Guide
Ceph turns a pile of commodity servers into a unified storage platform that provides block devices, a POSIX filesystem, and an S3-compatible object store — all backed by the same cluster. It was built at UC Santa Cruz as a PhD thesis project, entered the Linux kernel in 2010, and now runs some of the largest storage deployments on the planet. Red Hat, SUSE, Canonical, and Proxmox all ship it as a first-class storage backend.
This guide is a complete walkthrough for building a Ceph cluster in a homelab: understanding the architecture, planning hardware, deploying with cephadm, configuring each storage interface, and tuning it to run well on modest hardware.

Understanding Ceph Architecture
Before you touch a command line, you need to understand what Ceph is actually doing. The architecture has four daemon types, a placement algorithm, and a unified storage layer. Every design decision in Ceph flows from one principle: no single point of failure.
RADOS: The Foundation
RADOS (Reliable Autonomic Distributed Object Store) is the core of Ceph. Everything — block devices, filesystems, object storage — is built on top of RADOS. At its heart, RADOS stores objects (binary blobs up to several megabytes) distributed across a cluster of storage nodes using a deterministic algorithm called CRUSH.
When a client writes a file to CephFS, that file is split into objects. When a VM writes to an RBD block device, each block becomes an object. When you PUT an object via the S3 gateway, it becomes a RADOS object. The interfaces differ, but the storage layer is the same.
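You can see this unified layer directly with the rados CLI once a cluster is running. Any data you store lands as a RADOS object, whichever interface wrote it. The pool name below is a placeholder for any pool you have created:

```shell
# Write a file into RADOS as a raw object, then inspect it.
# "test-pool" is a placeholder pool name.
echo "hello rados" > /tmp/hello.txt
sudo rados -p test-pool put hello-object /tmp/hello.txt

# List objects in the pool and check the stored size
sudo rados -p test-pool ls
sudo rados -p test-pool stat hello-object

# Read it back
sudo rados -p test-pool get hello-object /tmp/hello-copy.txt
```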
The Four Daemon Types
MON (Monitor) — Maintains the cluster map: which nodes exist, which OSDs are up, how data should be distributed. Monitors form a Paxos quorum, so you need an odd number (3 or 5). They don't handle data I/O — they're the cluster's brain, storing the authoritative copy of the CRUSH map, the OSD map, the MON map, and the MDS map.
MON responsibilities:
├── Cluster membership (which nodes are alive)
├── CRUSH map (how data is distributed)
├── Authentication (cephx keys)
└── Configuration database
OSD (Object Storage Daemon) — One OSD per physical disk. OSDs store data, handle replication, recovery, and rebalancing. They report their status to monitors and peer directly with each other for replication. A 3-node cluster with 4 disks per node runs 12 OSDs.
Each OSD uses BlueStore as its storage backend, which writes directly to the raw block device — no filesystem layer in between. BlueStore manages its own write-ahead log (WAL) and metadata database (RocksDB), which can optionally live on a faster device.
MDS (Metadata Server) — Required only for CephFS. MDS handles POSIX filesystem metadata: directory listings, file attributes, permissions, and the inode namespace. Data blocks are stored in RADOS directly; MDS only handles metadata operations. You need at least one active MDS for CephFS, with standbys for failover.
MGR (Manager) — Runs cluster management modules: the web dashboard, Prometheus metrics exporter, balancer, crash collector, and various plugins. MGR doesn't handle data I/O but provides monitoring and management interfaces. One active, one or more standbys.
The CRUSH Algorithm
CRUSH (Controlled Replication Under Scalable Hashing) is what makes Ceph different from most distributed storage systems. Instead of looking up data locations in a central table, both clients and OSDs use the CRUSH algorithm to calculate where data should be stored. This means:
- No metadata server bottleneck for data placement
- Clients can talk directly to the OSD that holds their data
- Adding or removing OSDs requires moving only a proportional fraction of data
CRUSH uses a hierarchical map of your infrastructure:
root default
├── host ceph-node1
│   ├── osd.0 (sdb, 2TB HDD)
│   ├── osd.1 (sdc, 2TB HDD)
│   └── osd.2 (sdd, 2TB HDD)
├── host ceph-node2
│   ├── osd.3 (sdb, 2TB HDD)
│   ├── osd.4 (sdc, 2TB HDD)
│   └── osd.5 (sdd, 2TB HDD)
└── host ceph-node3
    ├── osd.6 (sdb, 2TB HDD)
    ├── osd.7 (sdc, 2TB HDD)
    └── osd.8 (sdd, 2TB HDD)
The CRUSH rule "replicate 3 times, each copy on a different host" ensures your data survives the loss of any single node.
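You can inspect the CRUSH hierarchy and rules on a running cluster directly; crushtool ships with the Ceph packages:

```shell
# Dump the compiled CRUSH map and decompile it to readable text
sudo ceph osd getcrushmap -o /tmp/crushmap.bin
crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
less /tmp/crushmap.txt

# Or inspect the hierarchy and rules without touching files
sudo ceph osd crush tree
sudo ceph osd crush rule ls
```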
Hardware Planning
Minimum Cluster Specification
| Component | Minimum | Recommended for Homelab |
|---|---|---|
| Nodes | 3 (hard minimum for quorum) | 3-5 |
| Network | 10 Gbps between nodes | 25 Gbps or dual 10 Gbps |
| RAM per OSD | 4 GB | 5-8 GB |
| CPU per OSD | 1 core | 2 cores |
| OS disk | 64 GB SSD (separate from data disks) | 128 GB SSD |
| OSD disks | HDD or SSD (don't mix types in a pool) | NVMe for performance, HDD for capacity |
| WAL/DB device | Same disk for SSD OSDs; NVMe for HDD OSDs | NVMe (1 per 4-6 HDD OSDs) |
RAM Budget
RAM is the most constrained resource. Each OSD uses ~4-8 GB, monitors use ~2-4 GB, and MDS can be RAM-hungry if you have millions of files.
Example: 3 nodes, 3 OSDs per node, CephFS enabled
Per node:
3 OSDs × 5 GB = 15 GB
1 MON = 2 GB
1 MGR = 1 GB
1 MDS (if CephFS) = 4 GB
OS + other workloads = 8 GB
─────────────────────────────────
Total per node = 30 GB
Minimum RAM: 32 GB per node
Network Design
Ceph replication multiplies network traffic. With 3x replication, every client write travels once over the public network to the primary OSD, then twice more over the cluster network to the replica OSDs. Separate your public (client-facing) and cluster (replication) networks:
Client network: 10.0.0.0/24 (clients, monitors, management)
Cluster network: 10.0.1.0/24 (OSD replication traffic only)
If you only have one network, Ceph still works — but replication traffic competes with client I/O. At minimum, use 10 Gbps. At 1 Gbps, Ceph will technically function but sequential throughput will be painful.
Disk Selection
For capacity (media, backups, archives): Use spinning HDDs. Put the BlueStore WAL and DB on an NVMe device to compensate for HDD random I/O limitations. One 500 GB NVMe can serve WAL/DB for 4-6 HDD OSDs.
For performance (VMs, databases, active workloads): Use SSDs or NVMe drives directly as OSDs. WAL/DB can live on the same device.
Never mix HDD and SSD OSDs in the same pool. Use separate CRUSH device classes and pools:
# Ceph detects device classes automatically; to override, clear the
# existing class first (set-device-class fails if one is already set)
sudo ceph osd crush rm-device-class osd.0 osd.1 osd.2
sudo ceph osd crush set-device-class hdd osd.0 osd.1 osd.2
sudo ceph osd crush set-device-class ssd osd.9 osd.10 osd.11
Deployment with Cephadm
Cephadm is the officially recommended deployment tool since Ceph Octopus (15.x). It uses containers (podman by default, docker as fallback) and manages daemons as systemd services.
Prerequisites
On all nodes:
# Install prerequisites (Ubuntu/Debian)
sudo apt update
sudo apt install -y podman lvm2 chrony
# Or for RHEL/Fedora
sudo dnf install -y podman lvm2 chrony
# Ensure time sync is working (Ceph requires <50ms clock skew)
sudo systemctl enable --now chronyd   # service is named "chrony" on Debian/Ubuntu
chronyc tracking
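Rather than repeating this by hand on every node, a small loop over SSH does the same work; the hostnames match the ones used throughout this guide, and passwordless SSH to root is assumed:

```shell
# Install prerequisites on all nodes from one shell (Ubuntu/Debian).
# Assumes passwordless SSH to root on each node.
for node in ceph-node1 ceph-node2 ceph-node3; do
  ssh root@"$node" "apt update && apt install -y podman lvm2 chrony"
  ssh root@"$node" "systemctl enable --now chrony && chronyc tracking | head -3"
done
```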
Bootstrap the First Node
# Download cephadm (use the latest stable release)
curl --silent --remote-name --location \
https://download.ceph.com/rpm-squid/el9/noarch/cephadm
chmod +x cephadm
sudo mv cephadm /usr/local/bin/
# Bootstrap the cluster
sudo cephadm bootstrap \
--mon-ip 10.0.0.10 \
--cluster-network 10.0.1.0/24 \
--allow-fqdn-hostname \
--dashboard-password-noupdate \
--initial-dashboard-password "changeme123"
Bootstrap creates the first MON and MGR, enables the dashboard, and generates SSH keys for adding other nodes. Save the output — it contains the dashboard URL and credentials.
# Verify the bootstrap succeeded
sudo ceph -s
# Should show: health: HEALTH_WARN (expected — only 1 MON so far)
Add Remaining Nodes
# Copy the SSH public key to other nodes
sudo ceph cephadm get-pub-key > /tmp/ceph.pub
ssh-copy-id -f -i /tmp/ceph.pub root@ceph-node2
ssh-copy-id -f -i /tmp/ceph.pub root@ceph-node3
# Add hosts
sudo ceph orch host add ceph-node2 10.0.0.11
sudo ceph orch host add ceph-node3 10.0.0.12
# Verify hosts are visible
sudo ceph orch host ls
Cephadm automatically deploys MON and MGR daemons across the new hosts. Wait a minute, then check:
sudo ceph -s
# health: HEALTH_OK
# mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
# mgr: ceph-node1(active, since ...), standbys: ceph-node2
Deploy OSDs
# List all available (unformatted, unmounted) disks
sudo ceph orch device ls
# Option 1: Add all available devices automatically
sudo ceph orch apply osd --all-available-devices
# Option 2: Add specific devices with WAL/DB on NVMe
sudo ceph orch daemon add osd ceph-node1:/dev/sdb
sudo ceph orch daemon add osd ceph-node1:/dev/sdc
sudo ceph orch daemon add osd ceph-node1:/dev/sdd
# For HDD OSDs with NVMe WAL/DB (significant performance improvement)
sudo ceph orch daemon add osd ceph-node1:data_devices=/dev/sdb,db_devices=/dev/nvme0n1
# Verify OSDs are up
sudo ceph osd tree
A healthy osd tree output shows all OSDs distributed across hosts:
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 16.37500 root default
-3 5.45833 host ceph-node1
0 hdd 1.81944 osd.0 up 1.00000 1.00000
1 hdd 1.81944 osd.1 up 1.00000 1.00000
2 hdd 1.81944 osd.2 up 1.00000 1.00000
-5 5.45833 host ceph-node2
3 hdd 1.81944 osd.3 up 1.00000 1.00000
4 hdd 1.81944 osd.4 up 1.00000 1.00000
5 hdd 1.81944 osd.5 up 1.00000 1.00000
-7 5.45833 host ceph-node3
6 hdd 1.81944 osd.6 up 1.00000 1.00000
7 hdd 1.81944 osd.7 up 1.00000 1.00000
8 hdd 1.81944 osd.8 up 1.00000 1.00000
Pool Configuration
Creating Replicated Pools
Pools are the basic unit of data placement. Each pool has a replication factor (size) and a set of placement groups (PGs).
# Create a replicated pool for VM block devices
sudo ceph osd pool create rbd-pool 64 64 replicated
sudo ceph osd pool set rbd-pool size 3
sudo ceph osd pool set rbd-pool min_size 2
sudo ceph osd pool application enable rbd-pool rbd
# Create a pool for general object storage
sudo ceph osd pool create object-pool 32 32 replicated
sudo ceph osd pool set object-pool size 3
sudo ceph osd pool application enable object-pool rgw
Placement Group (PG) calculation: The number of PGs affects data distribution. Too few = uneven distribution. Too many = excess memory usage. The formula:
PGs = (OSDs × 100) / replication_size
Example: 9 OSDs, size 3
PGs = (9 × 100) / 3 = 300 → round to nearest power of 2 = 256
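The formula is easy to script. This helper is my own sketch, not a Ceph command; it computes (OSDs × 100) / size and rounds to the nearest power of two:

```shell
# pg_count OSDS SIZE — suggested PG count, rounded to the nearest power of two.
pg_count() {
  osds=$1; size=$2
  raw=$(( osds * 100 / size ))
  # Find the smallest power of two >= raw, then pick the closer neighbour
  p=1
  while [ "$p" -lt "$raw" ]; do p=$(( p * 2 )); done
  lower=$(( p / 2 ))
  if [ $(( raw - lower )) -lt $(( p - raw )) ]; then
    echo "$lower"
  else
    echo "$p"
  fi
}

pg_count 9 3   # 9 OSDs, 3x replication -> 256
```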
For a small homelab, 64-128 PGs per pool is usually right. Let the PG autoscaler handle this:
sudo ceph osd pool set rbd-pool pg_autoscale_mode on
Erasure Coded Pools
Erasure coding provides better space efficiency than replication at the cost of higher CPU usage and some limitations. Partial overwrites are disabled by default, which rules out RBD and CephFS data pools unless you enable allow_ec_overwrites on the pool (supported since Luminous, BlueStore OSDs only); even then, RBD and CephFS still keep their metadata in a replicated pool.
# Create a k=2, m=1 profile (67% efficiency vs. 33% for 3x replication)
sudo ceph osd erasure-code-profile set homelab-ec k=2 m=1
# Create the pool
sudo ceph osd pool create archive-pool erasure homelab-ec
sudo ceph osd pool application enable archive-pool rgw
# Verify the profile
sudo ceph osd erasure-code-profile get homelab-ec
Use erasure coded pools for cold/archival data. Use replicated pools for anything requiring random writes (VMs, databases, CephFS metadata).
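If you do want block devices backed by erasure coded storage, partial overwrites can be enabled on BlueStore clusters, with the image metadata held in a replicated pool:

```shell
# Allow partial overwrites on the EC pool (BlueStore OSDs only)
sudo ceph osd pool set archive-pool allow_ec_overwrites true

# Create an RBD image whose data objects live on the EC pool;
# the replicated pool holds only the image metadata
sudo rbd create --size 102400 --data-pool archive-pool rbd-pool/archive-disk
```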
CRUSH Rules for Device Classes
If you have both HDDs and SSDs, create separate CRUSH rules so fast and slow pools don't mix:
# Create CRUSH rules for each device class
sudo ceph osd crush rule create-replicated replicated_hdd default host hdd
sudo ceph osd crush rule create-replicated replicated_ssd default host ssd
# Apply to pools
sudo ceph osd pool set rbd-pool crush_rule replicated_ssd
sudo ceph osd pool set archive-pool crush_rule replicated_hdd
Setting Up CephFS
CephFS is Ceph's POSIX-compliant distributed filesystem. It's ideal for shared storage accessible from multiple clients — home directories, media libraries, shared project files.
Deploy MDS Daemons
# Create a CephFS filesystem (automatically creates metadata and data pools)
sudo ceph fs volume create homelabfs
# Cephadm deploys MDS daemons automatically, but you can control placement
sudo ceph orch apply mds homelabfs --placement="3 ceph-node1 ceph-node2 ceph-node3"
# Verify MDS status
sudo ceph fs status homelabfs
Output:
homelabfs - 0 clients
=========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active ceph-node1 Reqs: 0/s 10 13 12 0
POOL TYPE USED AVAIL
cephfs.homelabfs.meta metadata 96k 5.4T
cephfs.homelabfs.data data 0 5.4T
STANDBY MDS
ceph-node2
ceph-node3
Mount CephFS on Clients
Kernel mount (faster, recommended for Linux clients):
# Get the admin key
sudo ceph auth get-key client.admin
# Mount with kernel driver
sudo mount -t ceph ceph-node1,ceph-node2,ceph-node3:/ /mnt/cephfs \
-o name=admin,secret=AQBxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==
# Or use a keyring file (writing to /etc/ceph requires root)
echo "AQBxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==" | sudo tee /etc/ceph/admin.secret
sudo chmod 600 /etc/ceph/admin.secret
sudo mount -t ceph ceph-node1,ceph-node2,ceph-node3:/ /mnt/cephfs \
-o name=admin,secretfile=/etc/ceph/admin.secret
FUSE mount (works on any OS with FUSE support, slightly slower):
sudo apt install ceph-fuse
sudo ceph-fuse /mnt/cephfs
Persistent mount via /etc/fstab:
# /etc/fstab entry for CephFS
ceph-node1,ceph-node2,ceph-node3:/ /mnt/cephfs ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev 0 0
CephFS Subvolumes
Use subvolumes to isolate tenants or workloads with individual quotas:
# Create subvolumes
sudo ceph fs subvolumegroup create homelabfs general
sudo ceph fs subvolume create homelabfs media --group_name general --size 2000000000000 # 2TB quota
sudo ceph fs subvolume create homelabfs backups --group_name general --size 5000000000000 # 5TB quota
# Get the mount path for a subvolume
sudo ceph fs subvolume getpath homelabfs media --group_name general
# Returns: /volumes/general/media/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
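Rather than mounting everything as client.admin, each subvolume can get its own restricted credentials via the volumes module (available in recent releases); the client name here is illustrative:

```shell
# Create a client key scoped to the media subvolume
# ("media-client" is an example auth ID, not a Ceph convention)
sudo ceph fs subvolume authorize homelabfs media media-client --group_name general

# Retrieve the resulting key for use in the mount command
sudo ceph auth get client.media-client
```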
Setting Up RBD (Block Devices)
RBD provides virtual block devices — thin-provisioned, snapshotable, and clonable. They're the primary storage backend for VMs in Proxmox, OpenStack, and Kubernetes.
# Create a 100 GB block device image
sudo rbd create --size 102400 rbd-pool/vm-disk-100
# List images
sudo rbd ls rbd-pool
# Get image info
sudo rbd info rbd-pool/vm-disk-100
# Enable features useful for VM workloads
sudo rbd feature enable rbd-pool/vm-disk-100 exclusive-lock object-map fast-diff
Snapshots and Clones
# Create a snapshot
sudo rbd snap create rbd-pool/vm-disk-100@before-upgrade
# List snapshots
sudo rbd snap ls rbd-pool/vm-disk-100
# Clone from a snapshot (instant, copy-on-write)
sudo rbd snap protect rbd-pool/vm-disk-100@before-upgrade
sudo rbd clone rbd-pool/vm-disk-100@before-upgrade rbd-pool/vm-disk-100-clone
# Rollback to a snapshot
sudo rbd snap rollback rbd-pool/vm-disk-100@before-upgrade
Mapping RBD on a Client
# Map the image as a block device
sudo rbd map rbd-pool/vm-disk-100
# Output: /dev/rbd0
# Format and mount
sudo mkfs.xfs /dev/rbd0
sudo mkdir -p /mnt/rbd-disk
sudo mount /dev/rbd0 /mnt/rbd-disk
# Persistent mapping via rbdmap
echo "rbd-pool/vm-disk-100 id=admin,keyring=/etc/ceph/ceph.client.admin.keyring" | sudo tee -a /etc/ceph/rbdmap
sudo systemctl enable rbdmap
Setting Up RADOS Gateway (S3-Compatible Object Storage)
RADOS Gateway (RGW) provides an HTTP API compatible with Amazon S3 (and partially with Swift). This gives you a private S3-compatible endpoint for backups, application storage, or any tool that speaks S3.
Deploy RGW
# Deploy RGW daemons via cephadm
sudo ceph orch apply rgw homelab-s3 \
--placement="2 ceph-node1 ceph-node2" \
--port=7480
# Verify RGW is running
sudo ceph orch ls --service-type=rgw
sudo ceph -s
# Should show: rgw: 2 daemons active (2 hosts, 1 zones)
Create Users and Buckets
# Create an S3 user
sudo radosgw-admin user create \
--uid=homelab-user \
--display-name="Homelab S3 User" \
--access-key=HOMELABKEY123 \
--secret-key=HOMELABSECRET456
# Verify user creation
sudo radosgw-admin user info --uid=homelab-user
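Quotas keep a runaway backup job from filling the cluster; the limits below are examples, not defaults:

```shell
# Cap the user at 500 GB and 100k objects (size is in bytes)
sudo radosgw-admin quota set --quota-scope=user --uid=homelab-user \
  --max-size=536870912000 --max-objects=100000
sudo radosgw-admin quota enable --quota-scope=user --uid=homelab-user

# Verify — the quota block appears in the user info output
sudo radosgw-admin user info --uid=homelab-user
```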
Configure an S3 Client
Test with the AWS CLI:
# Configure AWS CLI to point at your RGW
aws configure set aws_access_key_id HOMELABKEY123
aws configure set aws_secret_access_key HOMELABSECRET456
aws configure set default.region us-east-1
# Create a bucket
aws --endpoint-url http://ceph-node1:7480 s3 mb s3://backups
# Upload a file
aws --endpoint-url http://ceph-node1:7480 s3 cp /tmp/test.txt s3://backups/
# List bucket contents
aws --endpoint-url http://ceph-node1:7480 s3 ls s3://backups/
Or configure tools like restic, rclone, or MinIO Client to use your RGW endpoint:
# rclone config for Ceph RGW
# ~/.config/rclone/rclone.conf
[ceph-s3]
type = s3
provider = Ceph
access_key_id = HOMELABKEY123
secret_access_key = HOMELABSECRET456
endpoint = http://ceph-node1:7480
acl = private
# Test with rclone
rclone ls ceph-s3:backups
RGW with TLS
For production use, put RGW behind a reverse proxy (Caddy, Traefik, or nginx) with TLS:
# Caddyfile
s3.homelab.local {
    reverse_proxy ceph-node1:7480 ceph-node2:7480 {
        lb_policy round_robin
        health_uri /swift/healthcheck
        health_interval 10s
    }
}
Bucket Lifecycle Policies
Set automatic expiration for old objects:
# Create a lifecycle policy
cat > /tmp/lifecycle.json << EOF
{
"Rules": [
{
"ID": "expire-old-backups",
"Filter": {"Prefix": "daily/"},
"Status": "Enabled",
"Expiration": {"Days": 90}
}
]
}
EOF
aws --endpoint-url http://ceph-node1:7480 \
s3api put-bucket-lifecycle-configuration \
--bucket backups \
--lifecycle-configuration file:///tmp/lifecycle.json
Monitoring
Built-in Dashboard
Cephadm installs a web dashboard during bootstrap. Access it at https://<mon-ip>:8443. It provides:
- Cluster health overview
- OSD status and performance
- Pool utilization
- Host and service inventory
Prometheus Integration
For integration with your existing monitoring stack, enable the Prometheus exporter:
# Enable the Prometheus metrics module
sudo ceph mgr module enable prometheus
# Metrics are exposed on port 9283 by default
curl -s http://ceph-node1:9283/metrics | head -20
Add to your Prometheus configuration:
# prometheus.yml
scrape_configs:
  - job_name: 'ceph-cluster'
    honor_labels: true
    static_configs:
      - targets:
          - 'ceph-node1:9283'
          - 'ceph-node2:9283'
          - 'ceph-node3:9283'
Key Metrics and Alert Rules
# ceph-alerts.yml (Prometheus alerting rules)
groups:
  - name: ceph
    rules:
      - alert: CephHealthWarning
        expr: ceph_health_status == 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Ceph cluster health is WARNING"
      - alert: CephHealthError
        expr: ceph_health_status == 2
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Ceph cluster health is ERROR"
      - alert: CephOSDDown
        expr: ceph_osd_up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Ceph OSD {{ $labels.ceph_daemon }} is down"
      - alert: CephPoolNearFull
        # Fullness is used / (used + available), not used / available
        expr: ceph_pool_bytes_used / (ceph_pool_bytes_used + ceph_pool_max_avail) > 0.75
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Ceph pool {{ $labels.name }} is >75% full"
      - alert: CephSlowOps
        expr: ceph_healthcheck_slow_ops > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Ceph has slow operations"
Grafana Dashboards
Import the official Ceph dashboards from Grafana's library:
- Dashboard ID 2842: Ceph Cluster overview
- Dashboard ID 5336: Ceph Pools
- Dashboard ID 5342: Ceph OSD (per-OSD performance)
Import them through the Grafana UI: Dashboards → Import, enter the dashboard ID, and select your Prometheus data source. (Importing over the HTTP API requires downloading the full dashboard JSON from grafana.com first, so the UI route is simpler.)
Performance Tuning
Network Optimization
# Enable jumbo frames on all Ceph interfaces (all nodes)
sudo ip link set enp5s0 mtu 9000
sudo ip link set enp6s0 mtu 9000
# Make persistent in NetworkManager
sudo nmcli connection modify ceph-public 802-3-ethernet.mtu 9000
sudo nmcli connection modify ceph-cluster 802-3-ethernet.mtu 9000
# Verify throughput between nodes
iperf3 -s # On node A
iperf3 -c 10.0.1.11 -P 4 -t 30 # From node B (should be close to line rate)
BlueStore Tuning
# Increase OSD memory target for SSD/NVMe OSDs (default 4 GB)
sudo ceph config set osd osd_memory_target 6442450944 # 6 GB
# Increase bluestore cache for read-heavy workloads
sudo ceph config set osd bluestore_cache_size_hdd 1073741824 # 1 GB for HDD
sudo ceph config set osd bluestore_cache_size_ssd 3221225472 # 3 GB for SSD
# Tune min_alloc_size for SSDs (4K is default, good for most workloads)
sudo ceph config set osd bluestore_min_alloc_size_ssd 4096
Recovery and Backfill Tuning
When an OSD goes down and comes back, Ceph recovers data. By default, recovery is conservative to avoid impacting client I/O. In a homelab, you can afford to be more aggressive:
# Increase recovery speed (higher = faster recovery, more I/O impact)
sudo ceph config set osd osd_recovery_max_active 5 # default: 3
sudo ceph config set osd osd_max_backfills 3 # default: 1
sudo ceph config set osd osd_recovery_sleep_hdd 0.05 # default: 0.1
sudo ceph config set osd osd_recovery_sleep_ssd 0.0 # default: 0
# Weight recovery ops higher relative to client I/O during maintenance windows
sudo ceph config set osd osd_recovery_op_priority 10 # default: 3; higher = recovery favored
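These config overrides are persistent, not one-shot. Once recovery completes, dropping them restores the conservative defaults and gives client I/O its headroom back:

```shell
# Remove the overrides; OSDs fall back to built-in defaults
sudo ceph config rm osd osd_recovery_max_active
sudo ceph config rm osd osd_max_backfills
sudo ceph config rm osd osd_recovery_sleep_hdd

# Confirm what is active now
sudo ceph config get osd osd_max_backfills
```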
Client-Side Tuning
# For kernel RBD, increase queue depth (writing to /sys requires root)
echo 128 | sudo tee /sys/block/rbd0/queue/nr_requests
# Increase readahead on mapped RBD devices for sequential workloads
echo 4096 | sudo tee /sys/block/rbd0/queue/read_ahead_kb
# For CephFS kernel mounts, readahead is set with the rasize mount option
# instead, e.g. mount -t ceph ... -o rasize=67108864
# For librbd (QEMU/KVM), tune in the Ceph config
sudo ceph config set client rbd_cache true
sudo ceph config set client rbd_cache_size 67108864 # 64 MB
sudo ceph config set client rbd_cache_max_dirty 50331648 # 48 MB
sudo ceph config set client rbd_cache_target_dirty 33554432 # 32 MB
PG Autoscaler
Let the PG autoscaler handle placement group optimization. It adjusts PG counts based on pool usage:
# Enable globally
sudo ceph config set global osd_pool_default_pg_autoscale_mode on
# Check current PG distribution
sudo ceph osd pool autoscale-status
Day-Two Operations
Adding a New OSD
# Insert a new disk, then:
sudo ceph orch device ls --hostname=ceph-node1
sudo ceph orch daemon add osd ceph-node1:/dev/sde
# Monitor rebalancing
sudo ceph -w # Watch cluster events in real time
Removing an OSD
# Mark OSD out (triggers data migration)
sudo ceph osd out osd.5
# Wait for rebalancing to complete
sudo ceph -w
# Remove the OSD daemon
sudo ceph orch osd rm 5
# Purge the OSD from the CRUSH map
sudo ceph osd purge 5 --yes-i-really-mean-it
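Note that ceph orch osd rm drains the OSD before removing it, which can take a while on a full cluster; the orchestrator exposes its progress separately:

```shell
# Watch the drain/removal queue
sudo ceph orch osd rm status

# If you change your mind before the drain finishes
sudo ceph orch osd rm stop 5
```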
Upgrading Ceph
# Check current version
sudo ceph versions
# Start the rolling upgrade (cephadm handles the order: MON → MGR → OSD → MDS → RGW)
sudo ceph orch upgrade start --image quay.io/ceph/ceph:v19.2.0
# Monitor progress
sudo ceph orch upgrade status
Health Checks
Run these regularly:
# Overall status
sudo ceph -s
# Check for slow OSDs
sudo ceph osd perf | sort -k2 -n -r | head -5
# Check for PGs in unusual states
sudo ceph pg stat
# Verify all OSDs have similar utilization
sudo ceph osd df | sort -k7 -n
# Check for clock skew
sudo ceph time-sync-status
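For unattended setups, the same checks can run from cron or a systemd timer. This is my own sketch, not a Ceph tool; the health test is split into a function so the logic stands on its own:

```shell
# check_health: succeed (exit 0) only when a Ceph health string is HEALTH_OK.
check_health() {
  case "$1" in
    HEALTH_OK*) return 0 ;;
    *)          return 1 ;;
  esac
}

# On a cluster node, wire it up like this (logger is one option; mail is another):
#   status=$(ceph health detail)
#   check_health "$status" || logger -p user.err "Ceph unhealthy: $status"
```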
Ceph is not simple. It has more moving parts than any other storage solution you can run in a homelab. But it solves a specific problem that nothing else does: providing unified block, file, and object storage across multiple nodes with no single point of failure. If your homelab has grown to 3+ nodes and you need shared storage that survives hardware failures, Ceph is the tool for the job. If you're not there yet, start with ZFS on a single node and grow into Ceph when the need is real.
