
Setting Up Grafana Alloy for Homelab Observability

Monitoring · 2026-02-15 · 7 min read
Tags: grafana, alloy, observability, monitoring, opentelemetry, metrics

By the HomeLab Starter Editorial Team, home lab enthusiasts covering hardware setup, networking, and self-hosted services for home and small office environments.

Running separate agents for metrics, logs, and traces gets old fast. You end up with Promtail on every machine for logs, Node Exporter for metrics, maybe an OTEL Collector somewhere for traces, each with its own config format and deployment lifecycle. Grafana Alloy replaces all of them with a single binary that handles metrics, logs, and traces through one unified configuration.


Alloy is Grafana Labs' successor to Grafana Agent. It shipped as a stable release in early 2024, and Grafana Agent has been officially deprecated since. If you're still running Grafana Agent or juggling multiple collection agents across your homelab, Alloy is the upgrade worth making.


What Is Grafana Alloy

Grafana Alloy is an OpenTelemetry-compatible telemetry collector. It collects metrics, logs, and traces from your infrastructure and applications, processes them with pipeline stages, and ships them to backends like Prometheus, Loki, Tempo, or any OTLP-compatible endpoint.

The key differences from the old Grafana Agent, all covered later in this guide:

- One declarative configuration language (River) instead of separate YAML subsystems for metrics, logs, and traces
- First-class OpenTelemetry support: Alloy speaks OTLP natively alongside Prometheus and Loki protocols
- A built-in web UI (port 12345) for inspecting and debugging the pipeline
- Explicit component stability tiers (generally-available, public-preview, experimental)

Architecture Overview

Alloy runs on every machine you want to observe. Each instance collects local metrics, scrapes log files, and optionally receives traces. Everything flows to your central backends.

[Machine 1: Alloy] ──metrics──→ Prometheus
                   ──logs────→ Loki
                   ──traces──→ Tempo
[Machine 2: Alloy] ──metrics──→ Prometheus
                   ──logs────→ Loki

[Grafana] ──queries──→ Prometheus, Loki, Tempo

For smaller homelabs (2-5 machines), run the backends and Alloy on the same box. For larger setups, dedicate a machine to your monitoring stack and run Alloy agents on everything else.

Installation

Docker

docker run -d \
  --name alloy \
  --restart=unless-stopped \
  --net=host --pid=host \
  -v /:/host:ro,rslave \
  -v /var/log:/var/log:ro \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v ./config.alloy:/etc/alloy/config.alloy \
  grafana/alloy:latest \
  run /etc/alloy/config.alloy \
  --server.http.listen-addr=0.0.0.0:12345 \
  --stability.level=generally-available

The --net=host and --pid=host flags give Alloy access to accurate host metrics. The Docker socket mount enables container discovery.

Bare Metal (Debian/Ubuntu)

sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update && sudo apt install alloy
sudo systemctl enable --now alloy

For Fedora/RHEL: sudo dnf config-manager --add-repo https://rpm.grafana.com && sudo dnf install alloy

The config file lives at /etc/alloy/config.alloy. The built-in UI is at http://localhost:12345.

Kubernetes (Helm)

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install alloy grafana/alloy -n monitoring --create-namespace -f values.yaml
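
The -f values.yaml above is where your pipeline lives: the grafana/alloy chart reads the configuration from alloy.configMap.content. A minimal sketch; the Prometheus URL is a placeholder for your own in-cluster service:

```yaml
# values.yaml (sketch; adjust the remote-write URL to your setup)
alloy:
  configMap:
    content: |
      // Minimal pipeline: scrape Alloy's own metrics and push them out.
      prometheus.scrape "self" {
        targets    = [{ __address__ = "localhost:12345", job = "alloy" }]
        forward_to = [prometheus.remote_write.default.receiver]
      }
      prometheus.remote_write "default" {
        endpoint {
          url = "http://prometheus.monitoring.svc:9090/api/v1/write"
        }
      }
```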


Full Docker Compose Stack

A complete observability stack with Alloy, Prometheus, Loki, and Grafana:

# ~/monitoring/docker-compose.yml
services:
  alloy:
    image: grafana/alloy:latest
    container_name: alloy
    restart: unless-stopped
    network_mode: host
    pid: host
    volumes:
      - /:/host:ro,rslave
      - /var/log:/var/log:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./config.alloy:/etc/alloy/config.alloy
    command:
      - run
      - /etc/alloy/config.alloy
      - --server.http.listen-addr=0.0.0.0:12345
      - --stability.level=generally-available

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - prometheus_data:/prometheus
    command:
      - '--storage.tsdb.retention.time=90d'
      - '--web.enable-remote-write-receiver'
      - '--config.file=/etc/prometheus/prometheus.yml'

  loki:
    image: grafana/loki:3.3.2
    container_name: loki
    restart: unless-stopped
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yml:/etc/loki/loki-config.yml
      - loki_data:/loki
    command: -config.file=/etc/loki/loki-config.yml

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      GF_SECURITY_ADMIN_PASSWORD: "changeme"
      GF_USERS_ALLOW_SIGN_UP: "false"

volumes:
  prometheus_data:
  loki_data:
  grafana_data:

The --web.enable-remote-write-receiver flag on Prometheus is critical — it lets Alloy push metrics via the remote write API instead of requiring Prometheus to scrape each Alloy instance.

Configuration Walkthrough

Alloy uses its own configuration syntax, originally called River. If you've used HCL (Terraform), it will feel familiar. Every block is a component with a type, an optional label, and arguments. Components reference each other through expressions, forming a directed pipeline.

Here's a complete config.alloy that collects node metrics, Docker container metrics, Docker logs, and journal logs:

// ============================================================
// METRICS: Node Exporter (host metrics)
// ============================================================
prometheus.exporter.unix "default" {
  set_collectors = [
    "cpu", "diskstats", "filesystem", "loadavg",
    "meminfo", "netdev", "os", "time", "uname",
  ]
  filesystem {
    fs_types_exclude = "^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs|tmpfs)$"
  }
}

prometheus.scrape "node" {
  targets         = prometheus.exporter.unix.default.targets
  forward_to      = [prometheus.remote_write.default.receiver]
  scrape_interval = "15s"
}

// ============================================================
// METRICS: Docker container metrics (cAdvisor-style)
// ============================================================
prometheus.exporter.cadvisor "docker" {
  docker_host = "unix:///var/run/docker.sock"
}

prometheus.scrape "cadvisor" {
  targets         = prometheus.exporter.cadvisor.docker.targets
  forward_to      = [prometheus.remote_write.default.receiver]
  scrape_interval = "30s"
}

// ============================================================
// METRICS: Ship to Prometheus
// ============================================================
prometheus.remote_write "default" {
  endpoint {
    url = "http://localhost:9090/api/v1/write"
  }
  // Use constants.hostname rather than env("HOSTNAME"): the
  // HOSTNAME variable is often unset under systemd services.
  external_labels = {
    instance = constants.hostname,
  }
}

// ============================================================
// LOGS: Docker container logs
// ============================================================
discovery.docker "containers" {
  host = "unix:///var/run/docker.sock"
}

discovery.relabel "docker_logs" {
  targets = discovery.docker.containers.targets
  rule {
    source_labels = ["__meta_docker_container_name"]
    regex         = "/(.*)"
    target_label  = "container"
  }
  rule {
    source_labels = ["__meta_docker_container_label_com_docker_compose_service"]
    target_label  = "compose_service"
  }
}

loki.source.docker "containers" {
  host       = "unix:///var/run/docker.sock"
  targets    = discovery.relabel.docker_logs.output
  forward_to = [loki.process.docker_logs.receiver]
}

loki.process "docker_logs" {
  forward_to = [loki.write.default.receiver]
  stage.drop {
    expression = "(?i)healthcheck|health_check"
  }
  stage.static_labels {
    values = { source = "docker" }
  }
}

// ============================================================
// LOGS: Systemd journal
// ============================================================
loki.relabel "journal_labels" {
  forward_to = []
  rule {
    source_labels = ["__journal__systemd_unit"]
    target_label  = "unit"
  }
  rule {
    source_labels = ["__journal__hostname"]
    target_label  = "hostname"
  }
  rule {
    source_labels = ["__journal_priority_keyword"]
    target_label  = "level"
  }
}

loki.source.journal "system" {
  forward_to    = [loki.process.journal.receiver]
  max_age       = "12h"
  relabel_rules = loki.relabel.journal_labels.rules
  labels        = { source = "journal" }
}

loki.process "journal" {
  forward_to = [loki.write.default.receiver]
  // "level" is a label here, not extracted data, so drop debug
  // entries with a stream selector instead of stage.drop's source.
  stage.match {
    selector = "{level=\"debug\"}"
    action   = "drop"
  }
}

// ============================================================
// LOGS: Ship to Loki
// ============================================================
loki.write "default" {
  endpoint {
    url = "http://localhost:3100/loki/api/v1/push"
  }
}

How the Pipeline Works

Every block follows <type> "<label>" { ... }. Blocks wire together through expressions:

- Exporter and discovery components export a targets list (e.g. prometheus.exporter.unix.default.targets)
- Scrape and source components consume those targets and name their downstream stage via forward_to
- forward_to points at the receiver export of a processor or writer (prometheus.remote_write, loki.write)

This forms a DAG. The built-in UI at port 12345 visualizes this graph, showing data flow and flagging unhealthy components.

Collecting Node Metrics, Docker Metrics, and Logs

Node Metrics

The prometheus.exporter.unix component replaces the standalone Node Exporter binary. Same metrics, no separate process. Use these PromQL queries in Grafana:

# CPU usage per host
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory usage
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

# Disk usage
(1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})) * 100

Docker Container Metrics

The prometheus.exporter.cadvisor component provides container metrics without a separate cAdvisor container:

rate(container_cpu_usage_seconds_total[5m])           # CPU
container_memory_working_set_bytes                    # Memory
rate(container_network_receive_bytes_total[5m])       # Network

Custom Scrape Targets

Add scrape targets for services that expose Prometheus metrics:

prometheus.scrape "traefik" {
  targets = [{
    __address__ = "traefik:8080",
    job         = "traefik",
  }]
  forward_to   = [prometheus.remote_write.default.receiver]
  metrics_path = "/metrics"
}
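
The config above covers metrics and logs. To complete the traces leg mentioned at the start, here is a minimal sketch using Alloy's otelcol components, assuming your apps send OTLP and that "tempo:4317" is a placeholder for your Tempo host's OTLP gRPC endpoint:

```river
// Receive OTLP from instrumented applications on the standard ports.
otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }
  http {
    endpoint = "0.0.0.0:4318"
  }
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}

// Forward traces to Tempo ("tempo:4317" is a placeholder).
otelcol.exporter.otlp "tempo" {
  client {
    endpoint = "tempo:4317"
    tls {
      insecure = true
    }
  }
}
```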

Dashboard Setup in Grafana

Open Grafana at http://your-server:3000 and add data sources for Prometheus (http://prometheus:9090) and Loki (http://loki:3100).

Import Community Dashboards

Go to Dashboards > Import and use these IDs:

- 1860: Node Exporter Full (works as-is with Alloy's unix exporter)
- For container and Loki log dashboards, search grafana.com/grafana/dashboards for "cAdvisor" and "Loki"

Build a Homelab Overview

Create panels for a single-pane-of-glass view:

# All hosts CPU (Time series)
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory by host (Gauge; thresholds: green <70%, yellow 70-85%, red >85%)
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

# Container count (Stat)
count(container_memory_working_set_bytes{name!=""})

For the log panels, use the Loki data source:

# Error rate by host (Time series)
sum by (hostname) (rate({level=~"err|crit|alert|emerg"}[5m]))

# Recent errors (Logs panel)
{level=~"err|crit|alert|emerg"}

Alloy vs Telegraf vs Vector vs OTEL Collector

| Feature             | Grafana Alloy            | Telegraf           | Vector              | OTEL Collector |
|---------------------|--------------------------|--------------------|---------------------|----------------|
| Metrics             | Native Prometheus + OTLP | 300+ input plugins | Prometheus + custom | OTLP native    |
| Logs                | Native Loki + OTLP       | File tail, syslog  | Excellent pipeline  | OTLP logs      |
| Traces              | OTLP native              | Limited            | Limited             | OTLP native    |
| Config              | River (HCL-like)         | TOML               | TOML                | YAML           |
| Built-in UI         | Yes (port 12345)         | No                 | No                  | No             |
| Resource usage      | Low-moderate             | Low                | Very low            | Low            |
| Grafana integration | Native                   | Manual             | Manual              | Manual         |

Choose Alloy if you use the Grafana stack. The integration is seamless, and one binary replaces three agents.

Choose Telegraf if you use InfluxDB or need a specific input plugin from its massive ecosystem.

Choose Vector if you have a log-heavy pipeline and need maximum throughput with minimal resources.

Choose OTEL Collector if you need vendor-neutral telemetry that can switch backends easily.

Tips and Best Practices

Use the built-in UI for debugging. Open http://alloy-host:12345 to see the component graph and check for errors. When something isn't working, the UI tells you exactly which component is failing and why.

Keep label cardinality low. Don't add labels with unbounded values like user IDs or request paths. Stick to instance, job, container, and level.

Prefer remote write over scrape. Alloy pushes metrics to Prometheus, so you don't need to configure Prometheus with every Alloy instance's address. Each instance just pushes to the Prometheus URL.
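
On agent machines, this just means pointing the writer components at your central box. If that endpoint is ever exposed beyond your LAN, the endpoint block accepts basic_auth; note it is only enforced if something in front of Prometheus (e.g. a reverse proxy) actually checks credentials. A sketch with placeholder hostname and credentials:

```river
prometheus.remote_write "central" {
  endpoint {
    url = "http://monitoring-box:9090/api/v1/write"
    // Honored only when an authenticating reverse proxy sits
    // in front of Prometheus; credentials are placeholders.
    basic_auth {
      username = "alloy"
      password = "change-me"
    }
  }
}
```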

Drop noisy logs early. Use loki.process with stage.drop to filter health checks and debug noise before they reach Loki. Saves storage and makes logs more useful.

Pin image versions. Use specific tags (e.g., grafana/alloy:v1.5.1) instead of latest so updates don't break your stack unexpectedly.

Monitor Alloy itself. Add a self-scrape to catch issues with the collector:

prometheus.scrape "alloy_internal" {
  targets    = [{ __address__ = "localhost:12345", job = "alloy" }]
  forward_to = [prometheus.remote_write.default.receiver]
}

Start with stability.level=generally-available. Alloy has three stability tiers: GA, public-preview, and experimental. Stick with GA for production. Upgrade the stability flag only when you need a specific preview component.

Conclusion

Grafana Alloy collapses the observability agent sprawl that plagues homelab monitoring setups. Instead of maintaining Node Exporter, Promtail, and cAdvisor on every machine, you deploy one binary with one config file. Metrics, logs, and traces flow through a single pipeline you can visualize and debug from Alloy's built-in UI.

The migration path from an existing setup is straightforward — Alloy's unix exporter produces the same metric names as Node Exporter, so your dashboards and alerts work without changes. The River configuration language takes some getting used to coming from YAML, but the ability to reference component outputs as expressions makes complex pipelines cleaner than any YAML-based alternative.
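
If you're converting existing agent configs rather than rewriting them, Alloy also ships a convert subcommand that translates Promtail and Prometheus configs into Alloy syntax. A sketch; the input paths are examples, and it's worth running alloy convert --help to confirm the source formats your version supports:

```shell
alloy convert --source-format=promtail --output=promtail.alloy /etc/promtail/config.yml
alloy convert --source-format=prometheus --output=prometheus.alloy /etc/prometheus/prometheus.yml
```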

One systemd service or Docker container per machine. One config format. One UI to check when things look wrong. That's the kind of consolidation that makes running a homelab sustainable.
