homelab/terraform/proxmox-examples/docker-host/README.md
Claude 21d373a513
feat: Add GPU passthrough and NFS mount support to Terraform
- Added GPU passthrough configuration for NVIDIA GTX 1070
  - Dynamic hostpci block with OVMF BIOS and q35 machine type
  - EFI disk support when GPU is enabled
  - Configurable via enable_gpu_passthrough and gpu_pci_id variables

- Added NFS mount support for Proxmox host media directories
  - Mounts 11 media directories from Proxmox host to VM
  - Configurable source path and mount point
  - Persistent mounts via /etc/fstab
  - NFS client installation via cloud-init

- Added multi-OS support (Ubuntu, AlmaLinux, Debian)
  - Separate cloud-init templates for Ubuntu and AlmaLinux
  - OS-specific package installation (apt vs dnf)
  - OS type validation via variable

- Updated terraform.tfvars.example with new configuration options
- Updated README.md with comprehensive documentation:
  - AlmaLinux cloud template creation steps
  - GPU passthrough setup for AMD Ryzen + NVIDIA
  - NFS server configuration on Proxmox host
  - Troubleshooting for GPU and NFS issues
2025-11-11 05:39:37 +00:00

13 KiB

Docker Host VM with OpenTofu

This configuration creates a VM optimized for running Docker containers in your homelab with support for GPU passthrough and NFS media mounts.

What This Creates

  • Ubuntu or AlmaLinux VM (from cloud template)
  • Docker & Docker Compose installed
  • Homelab network created
  • /media directories structure
  • SSH key authentication
  • Automatic updates enabled
  • Optional GPU passthrough (NVIDIA GTX 1070)
  • Optional NFS mounts from Proxmox host

Prerequisites

1. Create Ubuntu Cloud Template

First, create a cloud-init enabled template in Proxmox:

# SSH to Proxmox server
ssh root@proxmox.local

# Download Ubuntu cloud image
wget https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img

# Create VM
qm create 9000 --name ubuntu-cloud-template --memory 2048 --net0 virtio,bridge=vmbr0

# Import disk
qm importdisk 9000 jammy-server-cloudimg-amd64.img local-lvm

# Attach disk
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0

# Add cloud-init drive
qm set 9000 --ide2 local-lvm:cloudinit

# Set boot disk
qm set 9000 --boot c --bootdisk scsi0

# Add serial console
qm set 9000 --serial0 socket --vga serial0

# Convert to template
qm template 9000

# Cleanup
rm jammy-server-cloudimg-amd64.img

Or create AlmaLinux 9.6 Cloud Template:

# SSH to Proxmox server
ssh root@proxmox.local

# Download AlmaLinux cloud image
wget https://repo.almalinux.org/almalinux/9/cloud/x86_64/images/AlmaLinux-9-GenericCloud-latest.x86_64.qcow2

# Create VM
qm create 9001 --name almalinux-cloud-template --memory 2048 --net0 virtio,bridge=vmbr0

# Import disk
qm importdisk 9001 AlmaLinux-9-GenericCloud-latest.x86_64.qcow2 local-lvm

# Attach disk
qm set 9001 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9001-disk-0

# Add cloud-init drive
qm set 9001 --ide2 local-lvm:cloudinit

# Set boot disk
qm set 9001 --boot c --bootdisk scsi0

# Add serial console
qm set 9001 --serial0 socket --vga serial0

# Convert to template
qm template 9001

# Cleanup
rm AlmaLinux-9-GenericCloud-latest.x86_64.qcow2

2. (Optional) Enable GPU Passthrough

For NVIDIA GTX 1070 on AMD Ryzen CPU:

# On Proxmox host, edit GRUB config
nano /etc/default/grub

# Add to GRUB_CMDLINE_LINUX_DEFAULT:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

# Update GRUB
update-grub

# Load required kernel modules
nano /etc/modules

# Add these lines:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

# Blacklist NVIDIA drivers on host
nano /etc/modprobe.d/blacklist.conf

# Add:
blacklist nouveau
blacklist nvidia
blacklist nvidiafb
blacklist nvidia_drm

# Update initramfs
update-initramfs -u -k all

# Reboot Proxmox host
reboot

# After reboot, verify IOMMU is enabled:
dmesg | grep -e DMAR -e IOMMU

# Find GPU PCI ID:
lspci | grep -i nvidia
# Output example: 01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070]
# Use: 0000:01:00 (note the format)

3. (Optional) Configure NFS Server on Proxmox

Export media directories from Proxmox host:

# On Proxmox host
# Install NFS server
apt update
apt install nfs-kernel-server -y

# Create /etc/exports entry
nano /etc/exports

# Add (replace 192.168.1.0/24 with your network):
/data/media/audiobooks 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/books 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/comics 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/complete 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/downloads 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/homemovies 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/incomplete 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/movies 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/music 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/photos 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/media/tv 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)

# Export NFS shares
exportfs -ra

# Enable and start NFS server
systemctl enable nfs-server
systemctl start nfs-server

# Verify exports
showmount -e localhost

4. Create API Token

In Proxmox UI:

  1. Datacenter → Permissions → API Tokens
  2. Add → User: root@pam, Token ID: terraform
  3. Uncheck "Privilege Separation"
  4. Save the secret!

5. Install OpenTofu

# Linux/macOS
curl --proto '=https' --tlsv1.2 -fsSL https://get.opentofu.org/install-opentofu.sh | sh

# Verify
tofu version

Quick Start

1. Configure Variables

cd terraform/proxmox-examples/docker-host

# Copy example config
cp terraform.tfvars.example terraform.tfvars

# Edit with your values
nano terraform.tfvars

Required changes:

  • pm_api_token_secret - Your Proxmox API secret
  • vm_ssh_keys - Your SSH public key
  • vm_password - Set a secure password

Optional changes:

  • vm_name - Change VM name
  • vm_cores / vm_memory - Adjust resources
  • vm_ip_address - Set static IP (or keep DHCP)
  • vm_os_type - Choose "ubuntu", "almalinux", or "debian"
  • template_vm_id - Use 9001 for AlmaLinux template
  • enable_gpu_passthrough - Set to true for GPU support
  • gpu_pci_id - Your GPU PCI ID (find with lspci)
  • mount_media_directories - Set to true for NFS mounts
  • proxmox_host_ip - IP for NFS server (Proxmox host)
  • media_source_path - Path on Proxmox host (default: /data/media)

2. Initialize

tofu init

3. Plan

tofu plan

Review what will be created.

4. Apply

tofu apply

Type yes to confirm.

5. Connect

# Get SSH command from output
tofu output ssh_command

# Or manually
ssh ubuntu@<VM-IP>

# Verify Docker
docker --version
docker ps
docker network ls | grep homelab

Configuration Options

Resource Sizing

Light workload (1-5 containers):

vm_cores  = 2
vm_memory = 4096
disk_size = "30"

Medium workload (5-15 containers):

vm_cores  = 4
vm_memory = 8192
disk_size = "50"

Heavy workload (15+ containers):

vm_cores  = 8
vm_memory = 16384
disk_size = "100"

Network Configuration

DHCP (easiest):

vm_ip_address = "dhcp"

Static IP:

vm_ip_address = "192.168.1.100"
vm_ip_netmask = 24
vm_gateway    = "192.168.1.1"

Multiple SSH Keys

vm_ssh_keys = [
  "ssh-rsa AAAAB3... user1@laptop",
  "ssh-rsa AAAAB3... user2@desktop"
]

GPU Passthrough Configuration

Enable NVIDIA GTX 1070 for Jellyfin, Ollama, Immich:

# Must complete Proxmox host GPU passthrough setup first
enable_gpu_passthrough = true
gpu_pci_id = "0000:01:00"  # Find with: lspci | grep -i nvidia

# Use AlmaLinux for better GPU support
vm_os_type = "almalinux"
template_vm_id = 9001

# Allocate sufficient resources
vm_cores = 8
vm_memory = 24576  # 24GB

Verify GPU in VM after deployment:

ssh ubuntu@<VM-IP>

# Install NVIDIA drivers (AlmaLinux)
sudo dnf install -y epel-release
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
sudo dnf install -y nvidia-driver nvidia-container-toolkit

# Verify
nvidia-smi
docker run --rm --gpus all nvidia/cuda:12.3.0-base-ubuntu22.04 nvidia-smi

NFS Media Mounts Configuration

Mount Proxmox host media directories to VM:

# Enable NFS mounts from Proxmox host
mount_media_directories = true

# Proxmox host IP (not API URL)
proxmox_host_ip = "192.168.1.100"

# Source path on Proxmox host
media_source_path = "/data/media"

# Mount point in VM
media_mount_path = "/media"

After deployment, verify mounts:

ssh ubuntu@<VM-IP>

# Check mounts
df -h | grep /media
ls -la /media

# Expected directories:
# /media/audiobooks, /media/books, /media/comics,
# /media/complete, /media/downloads, /media/homemovies,
# /media/incomplete, /media/movies, /media/music,
# /media/photos, /media/tv

Operating System Selection

AlmaLinux 9.6 (Recommended for GPU):

vm_os_type = "almalinux"
template_vm_id = 9001
vm_username = "almalinux"  # Default AlmaLinux user

Ubuntu 22.04 LTS:

vm_os_type = "ubuntu"
template_vm_id = 9000
vm_username = "ubuntu"

Key differences:

  • AlmaLinux: Better RHEL ecosystem, SELinux, dnf package manager
  • Ubuntu: Wider community support, apt package manager
  • Both support Docker, GPU passthrough, and NFS mounts

Post-Deployment

Deploy Homelab Services

# SSH to VM
ssh ubuntu@<VM-IP>

# Clone homelab repo (if not auto-cloned)
git clone https://github.com/efigueroa/homelab.git
cd homelab

# Deploy services
cd compose/core/traefik
docker compose up -d

cd ../lldap
docker compose up -d

# Continue with other services...

Verify Setup

# Check Docker
docker --version
docker compose version

# Check network
docker network ls | grep homelab

# Check media directories and NFS mounts
ls -la /media
df -h | grep /media

# If GPU passthrough is enabled
nvidia-smi
lspci | grep -i nvidia

# Check system resources
htop
df -h

Managing the VM

View State

tofu show
tofu state list

Update VM

  1. Edit terraform.tfvars:
vm_cores  = 8  # Increase from 4
vm_memory = 16384  # Increase from 8192
  1. Apply changes:
tofu plan
tofu apply

Note: Some changes require VM restart.

Destroy VM

# Backup any data first!
tofu destroy

Type yes to confirm deletion.

Troubleshooting

Template Not Found

Error: template with ID 9000 not found

Solution: Create cloud template (see Prerequisites)

API Permission Error

Error: permission denied

Solution: Check API token permissions:

pveum acl modify / -token 'root@pam!terraform' -role Administrator

Cloud-Init Not Working

Check cloud-init status:

ssh ubuntu@<VM-IP>
sudo cloud-init status
sudo cat /var/log/cloud-init-output.log

Docker Not Installed

Manual installation:

ssh ubuntu@<VM-IP>
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker ubuntu

VM Won't Start

Check Proxmox logs:

# On Proxmox server
qm status <VM-ID>
tail -f /var/log/pve/tasks/active

GPU Not Detected in VM

Verify IOMMU is enabled:

# On Proxmox host
dmesg | grep -e DMAR -e IOMMU
# Should show: "IOMMU enabled"

Check GPU is available:

# On Proxmox host
lspci | grep -i nvidia
lspci -n -s 01:00

# Verify it's not being used by host
lsmod | grep nvidia
# Should be empty (blacklisted)

In VM, install drivers:

# AlmaLinux
sudo dnf install -y epel-release
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
sudo dnf install -y nvidia-driver

# Ubuntu
sudo apt install -y nvidia-driver-535
sudo reboot

# Verify
nvidia-smi

NFS Mounts Not Working

On Proxmox host, verify NFS server:

systemctl status nfs-server
showmount -e localhost
# Should list all /data/media/* exports

In VM, test manual mount:

# Install NFS client if missing
sudo apt install nfs-common  # Ubuntu
sudo dnf install nfs-utils   # AlmaLinux

# Test mount
sudo mount -t nfs 192.168.1.100:/data/media/movies /mnt
ls /mnt
sudo umount /mnt

Check /etc/fstab in VM:

cat /etc/fstab | grep nfs
# Should show all media directory mounts

Firewall issues:

# On Proxmox host, allow NFS
ufw allow from 192.168.1.0/24 to any port nfs
# Or disable firewall temporarily to test:
systemctl stop ufw

Advanced Usage

Multiple VMs

Create docker-host-02.tfvars:

vm_name = "docker-host-02"
vm_ip_address = "192.168.1.101"

Deploy:

tofu apply -var-file="docker-host-02.tfvars"

Custom Cloud-Init

Edit main.tf to add custom cloud-init sections:

users:
  - name: myuser
    groups: sudo, docker
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL

packages:
  - zsh
  - tmux
  - neovim

runcmd:
  - sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"

Attach Additional Disk

Add to main.tf:

disk {
  datastore_id = var.storage
  size         = 200
  interface    = "scsi1"
}

Mount in cloud-init:

mounts:
  - ["/dev/sdb", "/mnt/data", "ext4", "defaults", "0", "0"]

Cost Analysis

Resource Usage:

  • 4 cores, 8GB RAM, 50GB disk
  • Running 24/7

Homelab Cost: $0 (uses existing hardware)

If in cloud (comparison):

  • AWS: ~$50-100/month
  • DigitalOcean: ~$40/month
  • Linode: ~$40/month

Homelab ROI: Pays for itself in ~2-3 months!

Security Hardening

Enable Firewall

Add to cloud-init:

runcmd:
  - ufw default deny incoming
  - ufw default allow outgoing
  - ufw allow ssh
  - ufw allow 80/tcp
  - ufw allow 443/tcp
  - ufw --force enable

Disable Password Authentication

After SSH key setup:

ssh_pwauth: false

Automatic Updates

Already enabled in cloud-init. Verify:

sudo systemctl status unattended-upgrades

Next Steps

  1. Deploy core services (Traefik, LLDAP, Tinyauth)
  2. Configure SSL certificates
  3. Deploy media services
  4. Set up backups (Restic)
  5. Add monitoring (Prometheus/Grafana)

Resources