Prometheus and Grafana: Metrics Monitoring for the Homelab

Monitoring · 2026-03-04 · 3 min read
Tags: prometheus, grafana, monitoring, metrics, homelab, docker, node-exporter, cadvisor, dashboards
By HomeLab Starter Editorial Team
Home lab enthusiasts covering hardware setup, networking, and self-hosted services for home and small office environments.

Uptime monitoring (is it up?) and metrics monitoring (how is it performing?) are different. Uptime Kuma handles uptime. Prometheus + Grafana handles everything else: CPU usage over time, memory trends, disk I/O, network throughput, container resource consumption. When something is slow or degrading, metrics tell you where.

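Architecture

The stack has five parts. node-exporter exposes host metrics (CPU, memory, disk, network) on port 9100, and cAdvisor exposes per-container metrics on port 8080. Prometheus (port 9090) scrapes both every 15 seconds, stores the results as time series, and evaluates alert rules. Alertmanager (port 9093) groups firing alerts and routes them to email or Slack. Grafana (port 3000) queries Prometheus and draws the dashboards.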

Docker Compose Stack

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert.rules.yml:/etc/prometheus/alert.rules.yml
      - prometheus-data:/prometheus
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      - --storage.tsdb.retention.time=30d
      - --web.enable-lifecycle  # Allow config reload via HTTP
    ports:
      - 9090:9090

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      GF_SECURITY_ADMIN_PASSWORD: change-this
      GF_USERS_ALLOW_SIGN_UP: "false"
    ports:
      - 3000:3000

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - --path.procfs=/host/proc
      - --path.rootfs=/rootfs
      - --path.sysfs=/host/sys
      - --collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)
    network_mode: host  # For accurate network metrics

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    restart: unless-stopped
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    devices:
      - /dev/kmsg
    ports:
      - 8080:8080
    privileged: true

  alertmanager:
    image: prom/alertmanager:latest
    container_name: alertmanager
    restart: unless-stopped
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
    ports:
      - 9093:9093

volumes:
  prometheus-data:
  grafana-data:
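
The `--web.enable-lifecycle` flag in the stack above exposes a reload endpoint, so edits to prometheus.yml or alert.rules.yml do not require a container restart. A minimal Python sketch, assuming Prometheus is reachable at http://localhost:9090 (the port mapping above); the function names are illustrative, not part of any library:

```python
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build the POST request for Prometheus's /-/reload lifecycle endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    # The endpoint rejects GET; an empty body is sufficient.
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url="http://localhost:9090", timeout=5):
    """Send the reload request. Returns True on HTTP 200; urlopen raises on errors."""
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```

Calling `reload_prometheus()` after editing a config file should make Prometheus re-read it; if the new file is invalid, Prometheus keeps the old configuration and the call raises an HTTP error.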

Prometheus Configuration

prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager:9093"]

rule_files:
  - "alert.rules.yml"

scrape_configs:
  # Prometheus self-monitoring
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

  # Linux host metrics
  - job_name: node
    static_configs:
      - targets:
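          - 192.168.1.50:9100  # This host's LAN IP (example); node-exporter uses network_mode: host, so the name "node-exporter" does not resolve here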
          - 192.168.1.51:9100  # Second host
          - 192.168.1.52:9100  # Third host

  # Docker container metrics
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]

  # Proxmox (via PVE exporter)
  - job_name: proxmox
    static_configs:
      - targets: ["pve-exporter:9221"]

  # Additional exporters...
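
After editing the scrape configuration, it helps to confirm that every target is actually being scraped. The sketch below reads Prometheus's `/api/v1/targets` endpoint and lists targets that are not healthy. It assumes Prometheus at http://localhost:9090; the function names and the sample payload are hand-written for illustration and trimmed to the fields used here:

```python
import json
import urllib.request


def fetch_targets(base_url="http://localhost:9090", timeout=5):
    """Fetch the raw /api/v1/targets payload from Prometheus."""
    url = base_url.rstrip("/") + "/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def unhealthy_targets(payload):
    """Return (scrapeUrl, lastError) for every active target that is not up."""
    return [
        (target["scrapeUrl"], target.get("lastError", ""))
        for target in payload["data"]["activeTargets"]
        if target["health"] != "up"
    ]


# Hand-written sample in the shape of a real response.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://192.168.1.51:9100/metrics", "health": "up", "lastError": ""},
            {"scrapeUrl": "http://192.168.1.52:9100/metrics", "health": "down",
             "lastError": "connection refused"},
        ]
    },
}
print(unhealthy_targets(sample))
```

Against a live server, `unhealthy_targets(fetch_targets())` gives the same view as the Status → Targets page in the Prometheus web UI.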

For hosts not running Docker, install node_exporter as a systemd service:

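# On each monitored host
# Release archives carry the version in their filename, so pin one explicitly.
# 1.8.2 is only an example; check the releases page for the current version.
VERSION=1.8.2
wget https://github.com/prometheus/node_exporter/releases/download/v${VERSION}/node_exporter-${VERSION}.linux-amd64.tar.gz
tar xzf node_exporter-${VERSION}.linux-amd64.tar.gz
sudo cp node_exporter-${VERSION}.linux-amd64/node_exporter /usr/local/bin/
sudo useradd -rs /bin/false node_exporter

# Create systemd service (tee writes the file as root)
sudo tee /etc/systemd/system/node_exporter.service > /dev/null << EOF
[Unit]
Description=Node Exporter

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
EOF

# Pick up the new unit file, then start it now and on every boot
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter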
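
Once the service is running, the exporter serves plain-text metrics on port 9100, the port used in the scrape targets above. A quick way to check a host before adding it to prometheus.yml is to fetch that page and read a value out of it. The parser below is a deliberately simplified sketch: it assumes one sample per line with no trailing timestamp, which is what node_exporter normally emits. The function names and sample text are illustrative:

```python
import urllib.request


def fetch_metrics(host, port=9100, timeout=5):
    """Download the raw text exposition from an exporter's /metrics page."""
    with urllib.request.urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Map each series (name plus labels) to its float value, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field; label values may contain spaces.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hand-written sample in the exposition format.
sample = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
"""
print(parse_metrics(sample)["node_load1"])
```

For a real host, `parse_metrics(fetch_metrics("192.168.1.51"))` would return the full set of series that Prometheus will see on its next scrape.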

Alert Rules

alert.rules.yml:

groups:
  - name: homelab
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is {{ $value }}%"

      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100 < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on {{ $labels.instance }}"
          description: "{{ $labels.mountpoint }} has {{ $value }}% free"

      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"
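
The `for:` field in these rules is what keeps short spikes from paging you: an alert stays pending until its expression has been true at every evaluation for the full duration, and only then fires. The following is a simplified model of that behaviour, not Prometheus's actual implementation; the function name and the sample readings are made up for illustration:

```python
def alert_states(samples, threshold, for_seconds):
    """Classify each evaluation as inactive, pending, or firing.

    samples: list of (timestamp_seconds, value) pairs in time order.
    """
    states = []
    active_since = None  # timestamp at which the condition first became true
    for timestamp, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = timestamp
            held_long_enough = timestamp - active_since >= for_seconds
            states.append("firing" if held_long_enough else "pending")
        else:
            # Any evaluation below the threshold resets the timer.
            active_since = None
            states.append("inactive")
    return states


# CPU readings against the HighCPUUsage rule: > 90 for 5m (300 seconds).
readings = [(0, 95), (150, 96), (300, 97), (315, 50), (330, 95)]
print(alert_states(readings, threshold=90, for_seconds=300))
```

In this sample the alert fires at the 300-second mark, drops back to inactive on the single reading of 50, and has to wait out the full five minutes again afterwards.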

Alertmanager Configuration

alertmanager.yml:

global:
  smtp_from: alertmanager@example.com  # Placeholder address
  smtp_smarthost: smtp.example.com:587
  smtp_auth_username: alertmanager@example.com  # Placeholder address
  smtp_auth_password: smtp-password

route:
  receiver: email-alerts
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: email-alerts
    email_configs:
      - to: you@example.com  # Placeholder address
        subject: '[Homelab Alert] {{ .GroupLabels.alertname }}'

  # Slack alternative
  - name: slack-alerts
    slack_configs:
      - api_url: https://hooks.slack.com/services/...
        channel: '#homelab-alerts'
        text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'
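
Rather than waiting for a real incident to find out whether notifications arrive, you can push a hand-made alert into Alertmanager through its v2 API. A sketch, assuming Alertmanager at http://localhost:9093 (the port mapping in the compose file); the alert name, labels, and function names are invented for the test:

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_test_alert(now=None, minutes=5):
    """Build a one-alert payload that expires on its own after a few minutes."""
    now = now or datetime.now(timezone.utc)
    return [{
        "labels": {"alertname": "TestAlert", "severity": "warning", "instance": "manual-test"},
        "annotations": {"description": "Manual test alert, safe to ignore"},
        "startsAt": now.isoformat(),
        "endsAt": (now + timedelta(minutes=minutes)).isoformat(),
    }]


def build_alert_request(payload, base_url="http://localhost:9093"):
    """Build the POST request for Alertmanager's /api/v2/alerts endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v2/alerts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_alert(base_url="http://localhost:9093", timeout=5):
    """Post the test alert. Returns True on HTTP 200; urlopen raises on errors."""
    request = build_alert_request(build_test_alert(), base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```

With the route above, the notification should reach the email-alerts receiver once the 30s group_wait has passed.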

Grafana Setup

  1. Open http://your-server:3000 and log in as admin with the password you set in GF_SECURITY_ADMIN_PASSWORD
  2. Add Prometheus data source: Connections → Data Sources → Add → Prometheus → URL: http://prometheus:9090
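
The same data source can be created without clicking through the UI by calling Grafana's HTTP API. The sketch below builds the request for `POST /api/datasources` with basic authentication. It assumes the admin credentials from the compose file; the function name is illustrative, and on a real setup you would likely prefer an API token over the admin password:

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, user, password,
                             prometheus_url="http://prometheus:9090"):
    """Build the request that registers Prometheus as Grafana's default data source."""
    body = {
        "name": "Prometheus",
        "type": "prometheus",
        "url": prometheus_url,   # Resolved from inside the Grafana container
        "access": "proxy",       # Grafana's backend makes the queries, not the browser
        "isDefault": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it; Grafana answers with an error status if a data source with the same name already exists.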

Pre-built dashboards from the Grafana dashboard library can be imported by ID:

  1. Grafana → Dashboards → New → Import
  2. Enter dashboard ID
  3. Select Prometheus data source
  4. Import

Proxmox Metrics Integration

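Proxmox metrics are usually collected with prometheus-pve-exporter, which queries the Proxmox API and exposes the results for Prometheus (port 9221, matching the pve-exporter target in prometheus.yml):

# On Proxmox host
apt install prometheus-pve-exporter

/etc/pve-exporter/config.yml:

default:
  user: prometheus@pve
  password: monitoring-password
  verify_ssl: false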

Or use the built-in Proxmox metrics API (Proxmox 7.2+):

Retention and Storage

Default Prometheus retention is 15 days. For homelab use, 30-90 days is more useful:

command:
  - --storage.tsdb.retention.time=90d

Expect roughly 5-15MB per monitored host per day at a 15s scrape interval. A homelab with 5 hosts therefore uses about 0.75-2.25GB per month (5 hosts × 5-15MB × 30 days).
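
Taking the 5-15MB per host per day figure at face value, the arithmetic for sizing the Prometheus volume can be sketched as follows. The function name is illustrative, and real usage depends on how many series each exporter exposes:

```python
def storage_mb(hosts, mb_per_host_per_day, days):
    """Estimate disk usage as hosts x daily volume x days kept."""
    return hosts * mb_per_host_per_day * days


hosts = 5
for days in (30, 90):
    low = storage_mb(hosts, 5, days)
    high = storage_mb(hosts, 15, days)
    print(f"{days}d retention, {hosts} hosts: {low}-{high} MB")
```

For five hosts this prints 750-2250 MB at 30 days and 2250-6750 MB at 90 days, so even the longer retention fits comfortably on a small SSD.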

PromQL Basics

Prometheus uses PromQL for queries:

# Current CPU usage percentage per host (100 minus the idle share)
100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory usage percentage
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

# Disk usage by mount point
(1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100

# Network receive rate (bytes/sec)
irate(node_network_receive_bytes_total{device="eth0"}[5m])

# Docker container CPU usage
rate(container_cpu_usage_seconds_total{name=~".+"}[5m]) * 100

These form the basis of dashboard panels. Grafana's panel editor shows the resulting graph as you type.
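
The same queries can be run outside Grafana through Prometheus's HTTP API, which is handy for scripts and quick checks. A sketch, assuming Prometheus at http://localhost:9090; the function names and the sample response are hand-written for illustration:

```python
import json
import urllib.request
from urllib.parse import urlencode


def instant_query_url(expr, base_url="http://localhost:9090"):
    """Build the URL for an instant query against /api/v1/query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def run_query(expr, base_url="http://localhost:9090", timeout=5):
    """Execute the query and return the decoded JSON payload."""
    with urllib.request.urlopen(instant_query_url(expr, base_url), timeout=timeout) as response:
        return json.load(response)


def vector_by_label(payload, label="instance"):
    """Flatten an instant-vector result into {label value: float}."""
    return {
        series["metric"].get(label, ""): float(series["value"][1])
        for series in payload["data"]["result"]
    }


# Hand-written sample in the shape of a real instant-vector response.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "192.168.1.51:9100"}, "value": [1772625600.0, "37.5"]},
        ],
    },
}
print(vector_by_label(sample))
```

Note that Prometheus returns sample values as strings, which is why the sketch converts them with `float`. Against a live server, `vector_by_label(run_query('(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100'))` would give memory usage per host.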
