Stirling PDF: Self-Hosted PDF Tools for Your Homelab
Every time you need to merge two PDFs or compress a large document, you probably reach for a random website that promises free PDF tools. You upload your files, hope the service isn't harvesting your data, and download the result. For personal documents, tax forms, or business contracts, that's a privacy risk you don't need to take.
Stirling PDF is a self-hosted web application that handles virtually every PDF operation you'd need — all running locally in your homelab. No data leaves your network, no accounts required, and no file size limits beyond your own hardware.
What Stirling PDF Can Do
The feature list is surprisingly comprehensive for a self-hosted tool:
- Merge and split — Combine multiple PDFs or extract specific page ranges
- Convert — PDF to/from images, Word, PowerPoint, HTML, and more
- OCR — Extract searchable text from scanned documents using Tesseract
- Compress — Reduce file sizes for email or archival
- Rotate, reorder, and remove pages — Visual drag-and-drop page management
- Add watermarks and stamps — Overlay text or images on pages
- Sign documents — Draw or upload signatures directly in the browser
- Flatten forms — Lock filled PDF form fields
- Extract images — Pull all embedded images from a PDF
- Compare documents — Visual diff between two PDF versions
- Password protect and unlock — Add or remove encryption
It's essentially a Swiss Army knife that replaces a dozen different online PDF tools.
Deploying with Docker Compose
The simplest way to run Stirling PDF is with Docker. Create a docker-compose.yml:
services:
  stirling-pdf:
    image: stirlingtools/stirling-pdf:latest
    container_name: stirling-pdf
    ports:
      - "8080:8080"
    volumes:
      - ./training-data:/usr/share/tessdata
      - ./configs:/configs
      - ./logs:/logs
    environment:
      - DOCKER_ENABLE_SECURITY=false
      - LANGS=en_GB
    restart: unless-stopped
Start it up:
docker compose up -d
Navigate to http://your-server:8080 and you'll see the full interface immediately — no setup wizard, no account creation.
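The container can take a little while to start answering requests. If you script the deployment, a small polling helper avoids guessing at a sleep duration. This is a minimal sketch: `wait_for_http` is a hypothetical helper name, and it assumes `curl` is installed on the host and that the web UI answers on the port mapped above.

```shell
#!/bin/sh
# Poll a URL until it answers with a non-error HTTP status, or give up.
# Arguments: URL, max attempts (default 30), delay in seconds (default 2).
wait_for_http() (
  url="$1"; attempts="${2:-30}"; delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: discard body
    if curl -fs -o /dev/null "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up after $attempts attempt(s)" >&2
  return 1
)

# Usage (after `docker compose up -d`):
#   wait_for_http "http://localhost:8080" 30 2
```

The function body runs in a subshell, so its variables do not leak into the calling script.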
Adding OCR Support
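The base image handles most operations, but OCR requires Tesseract language data. Because the compose file mounts ./training-data at /usr/share/tessdata, the language files need to be in that directory on the host. To add English OCR support: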
mkdir -p training-data
cd training-data
wget https://github.com/tesseract-ocr/tessdata_best/raw/main/eng.traineddata
For additional languages, download the corresponding .traineddata files. The tessdata_best repository has high-accuracy models for over 100 languages. After adding the files, restart the container:
docker compose restart
Now when you use the OCR feature, Stirling PDF will extract searchable text from scanned documents — useful for digitizing paper records or making old PDFs searchable.
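If you need several languages, the download step is easy to wrap in a loop. The sketch below reuses the tessdata_best URL from the wget command above; `tessdata_url` and `fetch_tessdata` are hypothetical helper names, and `deu` and `fra` are just example Tesseract language codes.

```shell
#!/bin/sh
# Base URL taken from the wget command above.
TESSDATA_BASE="https://github.com/tesseract-ocr/tessdata_best/raw/main"

# Print the download URL for one Tesseract language code (e.g. eng, deu, fra).
tessdata_url() {
  printf '%s/%s.traineddata\n' "$TESSDATA_BASE" "$1"
}

# Download each requested language into a target directory, skipping files
# that are already present.
# Arguments: target directory, then one or more language codes.
fetch_tessdata() (
  dir="$1"; shift
  mkdir -p "$dir" || return 1
  for lang in "$@"; do
    if [ -f "$dir/$lang.traineddata" ]; then
      echo "skip $lang (already present)"
      continue
    fi
    wget -q -O "$dir/$lang.traineddata" "$(tessdata_url "$lang")" || {
      echo "failed to download $lang" >&2
      rm -f "$dir/$lang.traineddata"   # do not leave a partial file behind
      return 1
    }
    echo "downloaded $lang"
  done
)

# Usage:
#   fetch_tessdata ./training-data eng deu fra
#   docker compose restart
```

Skipping existing files makes the script safe to re-run when you add a language later.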
Enabling Authentication
If you're exposing Stirling PDF beyond localhost (even within your LAN), consider enabling the built-in authentication:
    environment:
      - DOCKER_ENABLE_SECURITY=true
      - SECURITY_INITIALLOGIN_USERNAME=admin
      - SECURITY_INITIALLOGIN_PASSWORD=your-secure-password
This adds a login page and user management. You can create multiple accounts with different permission levels. For a more robust approach, place Stirling PDF behind your existing reverse proxy (Nginx Proxy Manager, Traefik, or Caddy) and delegate login to an SSO provider such as Authelia or Authentik.
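One way to keep the initial password out of docker-compose.yml is to generate it into a `.env` file, which Docker Compose reads for `${VAR}` substitution. This is a sketch under assumptions: `STIRLING_ADMIN_PASSWORD`, `gen_password`, and `write_env_file` are names made up for this example, not anything Stirling PDF defines.

```shell
#!/bin/sh
# Generate a random hex password of the given length (default 24, max 64).
gen_password() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n' | cut -c "1-${1:-24}"
}

# Write the password to an env file that only the owner can read.
# STIRLING_ADMIN_PASSWORD is a hypothetical variable name chosen for this sketch.
write_env_file() (
  file="${1:-.env}"
  umask 077
  printf 'STIRLING_ADMIN_PASSWORD=%s\n' "$(gen_password 24)" > "$file"
)

# Usage:
#   write_env_file .env
# Then reference it in docker-compose.yml instead of a literal password:
#   - SECURITY_INITIALLOGIN_PASSWORD=${STIRLING_ADMIN_PASSWORD}
```

Remember to keep the `.env` file out of version control if the compose file lives in a Git repository.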
Practical Workflows
Here are some real scenarios where a self-hosted PDF tool earns its place:
Tax Season Document Processing
Merge all your W-2s, 1099s, and receipts into a single PDF for your accountant. Use OCR on any scanned receipts first, then merge everything in order. The result is a searchable, organized file that never touched a third-party server.
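The merge step can be scripted as well. The sketch below assumes your instance exposes a merge endpoint at /api/v1/general/merge-pdfs that accepts one `fileInput` form field per file. Both the path and the field name are assumptions here, so check your instance's API documentation before relying on them. `merge_pdfs` is a hypothetical helper name.

```shell
#!/bin/sh
STIRLING_URL="${STIRLING_URL:-http://localhost:8080}"

# Merge PDFs in the order given.
# Arguments: output file, then two or more input files.
merge_pdfs() (
  out="$1"; shift
  if [ "$#" -lt 2 ]; then
    echo "need at least two input files" >&2
    return 1
  fi
  # Rewrite the argument list into one -F option per input file,
  # keeping the original order.
  n=$#
  while [ "$n" -gt 0 ]; do
    file="$1"; shift
    if [ ! -f "$file" ]; then
      echo "missing: $file" >&2
      return 1
    fi
    set -- "$@" -F "fileInput=@$file"
    n=$((n - 1))
  done
  # ASSUMPTION: endpoint path and field name, see the note above.
  curl -fsS -X POST "$STIRLING_URL/api/v1/general/merge-pdfs" "$@" -o "$out"
)

# Usage:
#   merge_pdfs taxes-2024.pdf w2.pdf 1099.pdf receipts-ocr.pdf
```

Every input is checked before anything is uploaded, so a typo in one filename does not produce a partial merge.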
Scanning and Archiving
Pair Stirling PDF with a document scanner or phone scanning app. Scan documents, upload to Stirling PDF for OCR processing, then store the searchable PDFs in your Paperless-ngx instance for long-term archival with full-text search.
Contract and Form Handling
Receive a PDF form, fill it out in your browser using Stirling PDF's form tools, add your signature, flatten it so fields can't be edited, and send it back. The entire workflow stays on your hardware.
Batch Processing via the API
Stirling PDF exposes a REST API for every operation. Automate repetitive tasks with simple curl commands:
# Compress a PDF via the API
curl -X POST "http://localhost:8080/api/v1/misc/compress-pdf" \
  -F "fileInput=@document.pdf" \
  -F "optimizeLevel=3" \
  -o compressed-document.pdf
This makes it easy to integrate into scripts — for example, automatically compressing PDFs that land in a specific folder.
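As a rough illustration of the folder idea, the script below compresses every PDF in one directory into another. The endpoint and `optimizeLevel` value come from the curl example above; `fileInput` is assumed to be the upload field name, and `compressed_name`, `compress_folder`, and the `STIRLING_URL` variable are hypothetical names for this sketch.

```shell
#!/bin/sh
STIRLING_URL="${STIRLING_URL:-http://localhost:8080}"

# Map an input path to its output path:
#   in/report.pdf + out  ->  out/report-compressed.pdf
compressed_name() {
  printf '%s/%s-compressed.pdf\n' "$2" "$(basename "$1" .pdf)"
}

# Compress every *.pdf in a source folder into a destination folder.
# Arguments: source directory, destination directory.
compress_folder() (
  src="$1"; dst="$2"
  mkdir -p "$dst" || return 1
  for pdf in "$src"/*.pdf; do
    [ -e "$pdf" ] || continue            # the glob matched nothing
    out="$(compressed_name "$pdf" "$dst")"
    if [ -e "$out" ]; then
      continue                           # already processed on an earlier run
    fi
    # ASSUMPTION: fileInput is the upload field name.
    if curl -fsS -X POST "$STIRLING_URL/api/v1/misc/compress-pdf" \
         -F "fileInput=@$pdf" -F "optimizeLevel=3" -o "$out"; then
      echo "compressed $pdf"
    else
      echo "failed: $pdf" >&2
      rm -f "$out"                       # drop any partial output
    fi
  done
)

# Usage:
#   compress_folder ./inbox ./compressed
```

Because files that already have an output are skipped, you can run this from cron or a systemd timer without reprocessing old documents.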
Resource Usage
Stirling PDF is lightweight for most operations. Expect around 200-400 MB of RAM at idle with occasional spikes during heavy operations like OCR on large documents. CPU usage scales with the operation — simple merges are instant, while OCR on a 100-page scanned document will peg a core for a few minutes.
For a typical homelab, any machine that can run Docker handles Stirling PDF without issue. It runs well on Raspberry Pi 4/5 hardware for light use, though OCR on large batches benefits from more CPU power.
Keeping It Updated
Stirling PDF is actively developed with frequent releases. Update with:
docker compose pull
docker compose up -d
Your configuration and training data persist in the mounted volumes, so updates are seamless.
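If you want a safety net before pulling a new image, you can archive the configs directory first. This is a minimal sketch: `backup_configs` is a hypothetical helper, and it assumes the `./configs` volume layout from the compose file above.

```shell
#!/bin/sh
# Archive the configs directory into a timestamped tarball.
# Arguments: project directory (default "."), backup directory
# (default "<project>/backups"). Prints the archive path on success.
backup_configs() (
  project="${1:-.}"
  backups="${2:-$project/backups}"
  archive="$backups/configs-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$backups" || return 1
  # -C keeps paths inside the archive relative to the project directory
  tar -czf "$archive" -C "$project" configs || return 1
  echo "$archive"
)

# Usage:
#   backup_configs . ./backups && docker compose pull && docker compose up -d
```

Chaining with `&&` means the update only runs if the backup succeeded.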
Putting It All Together
Stirling PDF fills a gap that most homelabbers don't realize they have until they stop and think about how often they use random online PDF tools. Once you have it running locally, those "quick" trips to sketchy PDF websites stop entirely. Your documents stay private, processing is fast on local hardware, and you get more features than most paid PDF services offer.
Combined with Paperless-ngx for document management and a scanning solution, Stirling PDF completes a fully self-hosted document pipeline. Install it once, and it quietly handles every PDF task that comes your way.
