Homelab Temperature Monitoring: Sensors, Alerts, and Dashboards

Monitoring · 2026-03-04 · 4 min read · Tags: temperature, lm-sensors, monitoring, grafana, prometheus, fan control, homelab, thermal
By HomeLab Starter Editorial Team — Home lab enthusiasts covering hardware setup, networking, and self-hosted services for home and small office environments.

Heat is the enemy of homelab hardware. A CPU that runs 20°C hotter than necessary for months ages faster. A hard drive that sits above 45°C tends to have a shorter lifespan. A GPU left at 90°C will throttle its clocks to protect itself, and sustained operation at that temperature wears out the card and its fans sooner.


Setting up temperature monitoring takes less than an hour and provides visibility into your hardware's thermal health before problems develop.

Reading Sensors with lm-sensors

lm-sensors is the standard Linux tool for reading CPU, motherboard, and SSD temperatures.

Install:

sudo apt install lm-sensors

Detect hardware sensors:

sudo sensors-detect  # follow the prompts; pressing Enter accepts the default answer

This scans your hardware and loads appropriate kernel modules.

Read current temperatures:

sensors

Sample output:

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +45.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:        +42.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:        +43.0°C

nvme-pci-0300
Adapter: PCI adapter
Composite:     +35.9°C  (low  = -273.1°C, high = +84.8°C)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +50.2°C

The output depends on your hardware, and a single machine will not show every block above. Intel CPUs report per-core temperatures via coretemp, AMD CPUs report via k10temp, and NVMe drives report a composite temperature.
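
If you want these readings in a script rather than on screen, recent lm-sensors releases can emit JSON with sensors -j. The following is a minimal Python sketch that assumes that JSON layout (chip, then label, then temp<N>_input keys); the function name and the SAMPLE string are illustrative and not part of lm-sensors.

```python
import json


def extract_temperatures(sensors_json: str) -> dict:
    """Map "chip/label" to degrees Celsius from `sensors -j` style JSON."""
    readings = {}
    for chip, features in json.loads(sensors_json).items():
        for label, values in features.items():
            if not isinstance(values, dict):
                continue  # skips the "Adapter" entry, which is a plain string
            for key, value in values.items():
                # Current temperature readings are named temp<N>_input.
                if key.startswith("temp") and key.endswith("_input"):
                    readings[f"{chip}/{label}"] = float(value)
    return readings


# Hand-written sample in the assumed layout, mirroring the output above.
SAMPLE = """{"coretemp-isa-0000": {"Adapter": "ISA adapter",
  "Package id 0": {"temp1_input": 45.0, "temp1_max": 80.0, "temp1_crit": 100.0},
  "Core 0": {"temp2_input": 42.0, "temp2_max": 80.0}}}"""
```

On a real machine you would pass in the output of subprocess.run(["sensors", "-j"], capture_output=True, text=True).stdout instead of SAMPLE.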

Hard Drive Temperatures with smartmontools

For spinning hard drives, use smartmontools:

sudo apt install smartmontools

# Check temperature of a specific drive
sudo smartctl -A /dev/sda | grep Temperature

Output:

194 Temperature_Celsius  ... 35 (Min/Max 18/44)

For bulk monitoring of multiple drives:

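# Requires bash for the brace expansion; adjust the drive letters to your system
for dev in /dev/sd{a,b,c,d}; do
  # Match SMART attribute 194 only, so other temperature-like rows are ignored
  temp=$(sudo smartctl -A "$dev" | awk '$1 == 194 {print $10}')
  echo "$dev: ${temp}°C"
done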
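
The same extraction can be done in Python if you would rather feed the value into another tool. This sketch assumes the usual ten-column smartctl -A attribute table, with the raw value in the tenth column; falling back to attribute 190 (an airflow temperature reported by some drives) is an assumption you may want to drop. The function name is hypothetical.

```python
def parse_smart_temperature(smartctl_output: str):
    """Return the raw temperature in °C from `smartctl -A` text, or None."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; the raw value is column 10.
        if len(fields) >= 10 and fields[0] in ("190", "194"):
            try:
                found[fields[0]] = int(fields[9])
            except ValueError:
                continue  # raw value was not a plain integer
    # Prefer 194 (Temperature_Celsius); fall back to 190 if it is missing.
    return found.get("194", found.get("190"))


# Hand-written sample rows in the assumed column layout.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   064   056   040    Old_age   Always       -       36 (Min/Max 20/44)
194 Temperature_Celsius     0x0022   035   044   000    Old_age   Always       -       35 (Min/Max 18/44)
"""
```

Returning None instead of raising makes it easy to skip drives that do not report a temperature at all.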

GPU Temperature (NVIDIA)

nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader
# Example output: 65

For fan speed and power:

nvidia-smi --query-gpu=temperature.gpu,fan.speed,power.draw --format=csv
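
To consume this CSV from a script, it helps to add noheader and nounits to the --format option so each row is plain numbers. The sketch below parses that form; it assumes one row per GPU with the three fields queried above, and it treats any non-numeric cell (for example a fan reading that the card does not report) as missing. The function and constant names are illustrative.

```python
import csv
import io

# Field order must match the --query-gpu list used on the command line.
FIELDS = ("temperature.gpu", "fan.speed", "power.draw")


def _to_float(cell: str):
    """Convert a CSV cell to float, or None when it is not numeric."""
    try:
        return float(cell)
    except ValueError:
        return None


def parse_gpu_rows(csv_text: str) -> list:
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != len(FIELDS):
            continue  # ignore blank or malformed lines
        rows.append(dict(zip(FIELDS, (_to_float(cell) for cell in row))))
    return rows
```

For example, parse_gpu_rows("65, 45, 120.50\n") yields a single dict with a temperature of 65.0, a fan speed of 45.0, and a power draw of 120.5.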


GPU Temperature (AMD)

cat /sys/class/drm/card0/device/hwmon/hwmon*/temp1_input
# Output in millidegrees: 65000 = 65°C

Alternatively, run sensors: when the amdgpu kernel module is loaded, the GPU appears there as an amdgpu-pci-* chip.
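
The millidegree conversion is easy to get wrong in ad hoc scripts, so here is a small Python sketch of it. It assumes the sysfs layout used in the command above (a device directory containing hwmon/hwmon*/temp*_input files holding integer millidegrees); the function name is hypothetical.

```python
from pathlib import Path


def read_hwmon_temps(device_dir) -> dict:
    """Return {"hwmonN/tempM": degrees Celsius} for a sysfs device directory."""
    temps = {}
    for path in sorted(Path(device_dir).glob("hwmon/hwmon*/temp*_input")):
        millidegrees = int(path.read_text().strip())
        sensor = path.name[: -len("_input")]  # "temp1_input" -> "temp1"
        temps[f"{path.parent.name}/{sensor}"] = millidegrees / 1000
    return temps
```

Calling read_hwmon_temps("/sys/class/drm/card0/device") covers the AMD GPU case above; an empty dict means no hwmon sensors were found under that device.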

Prometheus Node Exporter for Continuous Monitoring

For continuous monitoring integrated with Prometheus and Grafana, use node_exporter:

services:
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    network_mode: host
    pid: host
    volumes:
      - /:/host:ro,rslave
    command:
      - '--path.rootfs=/host'
      - '--collector.hwmon'   # temperature and fan sensors (on by default, listed for clarity)
      - '--collector.diskstats'
      - '--collector.filesystem'

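The hwmon collector reads the same kernel hwmon interface (/sys/class/hwmon) that lm-sensors uses and exposes the readings as Prometheus metrics. Chip labels are derived from the sysfs device path, so they differ from the chip names that sensors prints and vary between machines. For example:

node_hwmon_temp_celsius{chip="platform_coretemp_0",sensor="temp1"} 45
node_hwmon_fan_rpm{chip="platform_nct6775_2576",sensor="fan1"} 1200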
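
Before wiring up Prometheus, it can be useful to confirm which chip and sensor labels your machine actually exposes. The sketch below parses the text that node_exporter serves on its /metrics endpoint and lists the temperature series. It is a simplified parser written for this one metric, not a general implementation of the exposition format, and the chip names in the sample are illustrative.

```python
import re

# Matches lines such as: node_hwmon_temp_celsius{chip="...",sensor="temp1"} 45
LINE = re.compile(r"^node_hwmon_temp_celsius\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$")
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_hwmon_temps(metrics_text: str) -> list:
    """Return (chip, sensor, celsius) tuples found in /metrics text."""
    readings = []
    for line in metrics_text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # comments, other metrics, blank lines
        labels = dict(LABEL.findall(match.group("labels")))
        readings.append(
            (labels.get("chip"), labels.get("sensor"), float(match.group("value")))
        )
    return readings


# Hand-written sample; real chip labels depend on your hardware.
SAMPLE = """\
# HELP node_hwmon_temp_celsius Hardware monitor for temperature (input)
# TYPE node_hwmon_temp_celsius gauge
node_hwmon_temp_celsius{chip="platform_coretemp_0",sensor="temp1"} 45
node_hwmon_temp_celsius{chip="nvme_nvme0",sensor="temp1"} 35.85
node_hwmon_fan_rpm{chip="platform_nct6775_2576",sensor="fan1"} 1200
"""
```

The labels this prints are the ones to use in the chip and sensor selectors of your queries and alert rules.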

Prometheus Configuration

Add the node exporter as a scrape target:

# prometheus.yml
scrape_configs:
  - job_name: 'homelab-node'
    static_configs:
      - targets: ['your-server-ip:9100']

Grafana Temperature Dashboard

A useful dashboard configuration for CPU and drive temperatures:

CPU Core Temperature (average):

avg(node_hwmon_temp_celsius{chip=~".*coretemp.*"})

CPU Core Temperature (max):

max(node_hwmon_temp_celsius{chip=~".*coretemp.*"})

NVMe Temperature:

node_hwmon_temp_celsius{chip=~"nvme.*", sensor="temp1"}

Fan Speeds:

node_hwmon_fan_rpm{chip=~".*", sensor=~"fan.*"}

Use a gauge panel for current temperatures and a time series panel for historical trends.
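
You can try any of these queries outside Grafana through the Prometheus HTTP API, which is handy when a panel comes up empty and you need to check the selector. The sketch below uses the instant query endpoint /api/v1/query and only the Python standard library; the base URL you pass in and the helper names are assumptions for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base_url: str, promql: str) -> str:
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(response_text: str) -> list:
    """Turn an instant-query response into (labels, value) pairs."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    # Each sample is {"metric": {...}, "value": [timestamp, "string value"]}.
    return [(s["metric"], float(s["value"][1])) for s in body["data"]["result"]]


def instant_query(base_url: str, promql: str, timeout: float = 5.0) -> list:
    """Run the query against a live Prometheus server."""
    with urlopen(query_url(base_url, promql), timeout=timeout) as response:
        return parse_vector(response.read().decode())
```

An empty list from instant_query means the selector matched no series, which usually points at a chip or sensor label that differs on your hardware.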

Import a Pre-Built Dashboard

Grafana's dashboard library includes several node exporter dashboards with temperature support. Dashboard ID 1860 ("Node Exporter Full") includes temperature panels.

Grafana → Dashboards → Import
Enter ID: 1860
Select your Prometheus data source
Import

You may need to customize selectors for your specific sensor names.

Alertmanager Rules for Thermal Alerts

Define Prometheus alerting rules that fire when temperatures exceed safe thresholds. Prometheus evaluates the rules and hands firing alerts to Alertmanager:

# prometheus/rules/temperature.yml
groups:
  - name: temperature
    rules:
      - alert: CPUHighTemperature
        expr: max by (instance) (node_hwmon_temp_celsius{chip=~".*coretemp.*"}) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "CPU temperature high on {{ $labels.instance }}"
          description: "CPU temperature is {{ $value }}°C for 5+ minutes"

      - alert: CPUCriticalTemperature
        expr: max by (instance) (node_hwmon_temp_celsius{chip=~".*coretemp.*"}) > 95
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "CPU temperature critical on {{ $labels.instance }}"

      - alert: DiskHighTemperature
        expr: node_hwmon_temp_celsius{chip=~"nvme.*"} > 70
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "NVMe drive temperature high"

Send alerts to Discord, Slack, or email via Alertmanager's notification integrations.
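
The for: field in the rules above is what keeps a brief spike from paging you: the expression has to stay true for the whole duration before the alert fires. The following Python sketch simulates that behavior on a list of samples. It is a simplified model for building intuition, not Prometheus's actual evaluation code, and the function name is hypothetical.

```python
def firing_times(samples, threshold, hold_seconds):
    """Return the timestamps at which an alert with a hold time is firing.

    samples: (timestamp_seconds, value) pairs in time order.
    """
    pending_since = None  # when the condition most recently became true
    firing = []
    for timestamp, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = timestamp
            if timestamp - pending_since >= hold_seconds:
                firing.append(timestamp)
        else:
            pending_since = None  # any dip below the threshold resets the timer
    return firing


# One reading per minute: a short spike, a dip, then sustained heat.
READINGS = [(0, 82), (60, 83), (120, 79), (180, 85), (240, 86),
            (300, 84), (360, 88), (420, 87), (480, 90)]
```

With a threshold of 80 and a 300 second hold, the spike at the start never fires because the reading at 120 seconds resets the timer; the alert only fires once the condition has held from 180 to 480 seconds.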

Safe Temperature Ranges

Guidelines by component type:

Component	Normal	Warning	Critical
Intel CPU (idle)	30-45°C	70°C+	90°C+
Intel CPU (load)	50-75°C	85°C+	95°C+
AMD Ryzen (load)	60-80°C	90°C+	95°C+
NVMe SSD	30-45°C	65°C+	75°C+
SATA SSD	25-40°C	60°C+	70°C+
HDD	30-40°C	45°C+	50°C+
GPU (gaming load)	65-80°C	85°C+	90°C+

These are general guidelines. Check your specific component's datasheet for rated operating temperatures.
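
If you script your own checks, the table translates directly into a lookup. This sketch copies the Warning and Critical columns from the table above; the dictionary and function names are illustrative, and the thresholds should be replaced with values from your components' datasheets where you have them.

```python
# (warning, critical) thresholds in °C, taken from the guideline table.
THRESHOLDS = {
    "Intel CPU (idle)": (70, 90),
    "Intel CPU (load)": (85, 95),
    "AMD Ryzen (load)": (90, 95),
    "NVMe SSD": (65, 75),
    "SATA SSD": (60, 70),
    "HDD": (45, 50),
    "GPU (gaming load)": (85, 90),
}


def classify(component: str, temp_c: float) -> str:
    """Return "ok", "warning", or "critical" for a reading."""
    warning, critical = THRESHOLDS[component]
    if temp_c >= critical:
        return "critical"
    if temp_c >= warning:
        return "warning"
    return "ok"
```

For instance, classify("HDD", 47) returns "warning", since 47°C is past the 45°C warning line but below the 50°C critical line.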

Fan Control with fancontrol

fancontrol (from the lm-sensors project, packaged separately as fancontrol on Debian and Ubuntu) lets you set PWM fan curves based on temperature:

# Generate a fancontrol config
sudo pwmconfig

# Start fan control daemon
sudo systemctl enable --now fancontrol

pwmconfig interactively guides you through mapping fans to temperature sensors and setting temperature curves. The generated config file at /etc/fancontrol defines:

Which temperature sensor controls which fan
Temperature thresholds for fan speeds
Min/max fan speeds

This prevents fans from running at full speed constantly (noisy) while ensuring adequate cooling at high temperatures.
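
To make the idea of a fan curve concrete, here is a simplified linear curve in Python: the fan holds a minimum PWM value below a low temperature, runs at maximum above a high temperature, and ramps in between. This illustrates the concept only and is not the exact calculation fancontrol performs; the parameter names are hypothetical, and PWM values are assumed to use the 0 to 255 range of the kernel's pwm interface.

```python
def fan_pwm(temp_c, min_temp, max_temp, min_pwm, max_pwm):
    """Map a temperature to a PWM value on a straight line between two points."""
    if temp_c <= min_temp:
        return min_pwm  # quiet floor when the system is cool
    if temp_c >= max_temp:
        return max_pwm  # full speed once the upper threshold is reached
    fraction = (temp_c - min_temp) / (max_temp - min_temp)
    return round(min_pwm + fraction * (max_pwm - min_pwm))
```

With a curve from 40°C to 80°C and PWM from 55 to 255, a reading of 60°C sits halfway along the ramp and gives a PWM value of 155.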

Ambient Temperature Monitoring

For rack cooling verification, add an ambient temperature sensor:

USB temperature sensor: Generic USB HID thermometers work with usb-sensors or custom scripts.

ESPHome / Tasmota sensor: Attach a DHT22 or DS18B20 sensor to an ESP8266 and expose the reading over MQTT or HTTP. Read it from Prometheus with a custom exporter.

Home Assistant integration: If you run HA, expose your homelab rack temperature as an entity and pull it into Prometheus via the HA metrics endpoint.

Tracking ambient temperature alongside CPU and drive temperatures helps diagnose cooling issues: if ambient rises and CPU temps follow, you have airflow or cooling problems in the room, not the server.
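
The custom exporter mentioned for the ESP8266 option can be very small, because Prometheus only needs an HTTP endpoint that returns plain text metrics. The sketch below uses only the Python standard library. The metric name, the location label, port 9101, and the read_temperature callback (which you would implement to fetch the value from your sensor) are all assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metric(name: str, value: float, labels: dict) -> str:
    """Format one gauge in the Prometheus text exposition format."""
    label_text = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return f"# TYPE {name} gauge\n{name}{{{label_text}}} {value}\n"


def make_handler(read_temperature):
    """Build a request handler that reports read_temperature() on every GET."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = render_metric(
                "homelab_ambient_temperature_celsius",  # hypothetical metric name
                read_temperature(),
                {"location": "rack"},
            ).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler


def serve(read_temperature, port: int = 9101):
    """Serve metrics until interrupted; the port is an arbitrary choice."""
    HTTPServer(("", port), make_handler(read_temperature)).serve_forever()
```

Calling serve(my_reader) and adding the host and port as another scrape target puts the ambient reading next to your CPU and drive temperatures, so you can graph them on the same panel.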