docs: Add architecture docs and fix compose files for integration

Claude 2025-11-10 11:32:13 +00:00
parent 9fbd003798
commit 07a8154fea
No known key found for this signature in database
6 changed files with 1610 additions and 4 deletions

108
README.md

@@ -2,6 +2,23 @@
This repository contains Docker Compose configurations for self-hosted home services.
## 💻 Hardware Specifications
- **Host**: Proxmox VE 9 (Debian 13)
- CPU: AMD Ryzen 5 7600X (6 cores, 12 threads, up to 5.3 GHz)
- GPU: NVIDIA GeForce GTX 1070 (8GB VRAM)
- RAM: 32GB DDR5
- **VM**: AlmaLinux 9.6 (RHEL 9 compatible)
- CPU: 8 vCPUs
- RAM: 24GB
- Storage: 500GB+ (expandable)
- GPU: GTX 1070 (PCIe passthrough)
**Documentation:**
- [Complete Architecture Guide](docs/architecture.md) - Integration, networking, logging, GPU setup
- [AlmaLinux VM Setup](docs/setup/almalinux-vm.md) - Full installation and configuration guide
## 🏗️ Infrastructure
### Core Services (Port 80/443)
@@ -199,9 +216,21 @@ Each service has its own `.env` file where applicable. Key files to review:
- `core/lldap/.env` - LDAP configuration and admin credentials
- `core/tinyauth/.env` - LDAP connection and session settings
- `media/frontend/immich/.env` - Photo management configuration
- `services/linkwarden/.env` - Bookmark manager settings
- `services/karakeep/.env` - AI-powered bookmark manager
- `services/ollama/.env` - Local LLM configuration
- `services/microbin/.env` - Pastebin configuration
**Example Configuration Files:**
Several services include `.example` config files for reference:
- `media/automation/sonarr/config.xml.example`
- `media/automation/radarr/config.xml.example`
- `media/automation/sabnzbd/sabnzbd.ini.example`
- `media/automation/qbittorrent/qBittorrent.conf.example`
- `services/vikunja/config.yml.example`
- `services/FreshRSS/config.php.example`
Copy these to the appropriate location (usually `./config/`) and customize as needed.
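The copy step can be scripted. The following is a minimal sketch, assuming each `.example` file should land next to itself with the suffix removed (the real target may be `./config/`, as noted above). `copy_examples` is a hypothetical helper name, not something shipped in this repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy every *.example file under a directory to the same
# path without the ".example" suffix, skipping targets that already exist.
copy_examples() {
  local root="${1:-.}" example target
  find "$root" -type f -name '*.example' -print0 |
    while IFS= read -r -d '' example; do
      target="${example%.example}"
      if [ -e "$target" ]; then
        echo "skip $target (already exists)"
      else
        cp -- "$example" "$target"
        echo "create $target"
      fi
    done
}
```

For example, `copy_examples media/automation` would create `config.xml` next to each `config.xml.example` without overwriting files you have already customized.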
## 🔧 Maintenance
### Viewing Logs
@@ -241,6 +270,83 @@ Important data locations:
2. Check LLDAP connection in tinyauth logs
3. Verify LDAP bind credentials match in both services
### GPU not detected
1. Check GPU passthrough: `lspci | grep -i nvidia`
2. Verify drivers: `nvidia-smi`
3. Test in container: `docker exec ollama nvidia-smi`
4. See [AlmaLinux VM Setup](docs/setup/almalinux-vm.md) for GPU configuration
## 📊 Monitoring & Logging
### Centralized Logging (Loki + Promtail + Grafana)
All container logs are automatically collected and stored in Loki:
**Access Grafana**: https://logs.fig.systems
**Query examples:**
```logql
# View logs for specific service
{container="sonarr"}
# Filter by log level
{container="radarr"} |= "ERROR"
# Multiple services
{container=~"sonarr|radarr"}
# Search with JSON parsing
{container="karakeep"} |= "ollama" | json
```
**Retention**: 30 days (configurable in `compose/monitoring/logging/loki-config.yaml`)
### Uptime Monitoring (Uptime Kuma)
Monitor service availability and performance:
**Access Uptime Kuma**: https://status.fig.systems
**Features:**
- HTTP(s) monitoring for all web services
- Docker container health checks
- SSL certificate expiration alerts
- Public/private status pages
- 90+ notification integrations (Discord, Slack, Email, etc.)
### Service Integration
**How services integrate:**
```
Traefik (Reverse Proxy)
├─→ All services (SSL + routing)
└─→ Let's Encrypt (certificates)
Tinyauth (SSO)
├─→ LLDAP (user authentication)
└─→ Protected services (authorization)
Promtail (Log Collection)
├─→ Docker socket (all containers)
└─→ Loki (log storage)
Loki (Log Storage)
└─→ Grafana (visualization)
Karakeep (Bookmarks)
├─→ Ollama (AI tagging)
├─→ Meilisearch (search)
└─→ Chrome (web archiving)
Sonarr/Radarr (Media Automation)
├─→ SABnzbd/qBittorrent (downloads)
├─→ Jellyfin (media library)
└─→ Recyclarr/Profilarr (quality management)
```
See [Architecture Guide](docs/architecture.md) for complete integration details.
## 📄 License
This is a personal homelab configuration. Use at your own risk.


@@ -5,7 +5,36 @@ services:
  booklore:
    container_name: booklore
    image: ghcr.io/lorebooks/booklore:latest
    restart: unless-stopped
    env_file:
      - .env
    volumes:
      - ./data:/app/data
    networks:
      - homelab
    labels:
      # Traefik
      traefik.enable: true
      traefik.docker.network: homelab
      # Web UI
      traefik.http.routers.booklore.rule: Host(`booklore.fig.systems`) || Host(`booklore.edfig.dev`)
      traefik.http.routers.booklore.entrypoints: websecure
      traefik.http.routers.booklore.tls.certresolver: letsencrypt
      traefik.http.services.booklore.loadbalancer.server.port: 3000
      # SSO Protection
      traefik.http.routers.booklore.middlewares: tinyauth
      # Homarr Discovery
      homarr.name: Booklore
      homarr.group: Services
      homarr.icon: mdi:book-open-variant
networks:
  homelab:
    external: true


@@ -5,17 +5,36 @@ services:
  microbin:
    container_name: microbin
    image: danielszabo99/microbin:latest
    env_file: .env
    restart: unless-stopped
    env_file:
      - .env
    volumes:
      - ./data:/app/data
    networks:
      - homelab
    labels:
      # Traefik
      traefik.enable: true
      traefik.docker.network: homelab
      # Web UI
      traefik.http.routers.microbin.rule: Host(`paste.fig.systems`) || Host(`paste.edfig.dev`)
      traefik.http.routers.microbin.entrypoints: websecure
      traefik.http.routers.microbin.tls.certresolver: letsencrypt
      traefik.http.services.microbin.loadbalancer.server.port: 8080
      # Note: MicroBin has its own auth, SSO disabled by default
      # traefik.http.routers.microbin.middlewares: tinyauth
      # Homarr Discovery
      homarr.name: MicroBin
      homarr.group: Services
      homarr.icon: mdi:content-paste
networks:
  homelab:
    external: true


@@ -6,7 +6,36 @@ services:
    container_name: rsshub
    # Using chromium-bundled image for full puppeteer support
    image: diygod/rsshub:chromium-bundled
    restart: unless-stopped
    env_file:
      - .env
    volumes:
      - ./data:/app/data
    networks:
      - homelab
    labels:
      # Traefik
      traefik.enable: true
      traefik.docker.network: homelab
      # Web UI
      traefik.http.routers.rsshub.rule: Host(`rsshub.fig.systems`) || Host(`rsshub.edfig.dev`)
      traefik.http.routers.rsshub.entrypoints: websecure
      traefik.http.routers.rsshub.tls.certresolver: letsencrypt
      traefik.http.services.rsshub.loadbalancer.server.port: 1200
      # Note: RSSHub is public by design, SSO disabled
      # traefik.http.routers.rsshub.middlewares: tinyauth
      # Homarr Discovery
      homarr.name: RSSHub
      homarr.group: Services
      homarr.icon: mdi:rss-box
networks:
  homelab:
    external: true

648
docs/architecture.md Normal file

@@ -0,0 +1,648 @@
# Homelab Architecture & Integration
Complete integration guide for the homelab setup on AlmaLinux 9.6.
## 🖥️ Hardware Specifications
### Host System
- **Hypervisor**: Proxmox VE 9 (Debian 13 based)
- **CPU**: AMD Ryzen 5 7600X (6 cores, 12 threads, up to 5.3 GHz)
- **GPU**: NVIDIA GeForce GTX 1070 (8GB VRAM, 1920 CUDA cores)
- **RAM**: 32GB DDR5
### VM Configuration
- **OS**: AlmaLinux 9.6 (RHEL 9 compatible)
- **CPU**: 8 vCPUs (allocated from host)
- **RAM**: 24GB (leaving 8GB for host)
- **Storage**: 500GB+ (adjust based on media library size)
- **GPU**: GTX 1070 (PCIe passthrough from Proxmox)
## 🏗️ Architecture Overview
### Network Architecture
```
Internet
[Router/Firewall]
↓ (Port 80/443)
[Traefik Reverse Proxy]
┌──────────────────────────────────────┐
│ homelab network │
│ (Docker bridge - 172.18.0.0/16) │
│ │
│ ┌─────────────┐ ┌──────────────┐ │
│ │ Core │ │ Media │ │
│ │ - Traefik │ │ - Jellyfin │ │
│ │ - LLDAP │ │ - Sonarr │ │
│ │ - Tinyauth │ │ - Radarr │ │
│ └─────────────┘ └──────────────┘ │
│ │
│ ┌─────────────┐ ┌──────────────┐ │
│ │ Services │ │ Monitoring │ │
│ │ - Karakeep │ │ - Loki │ │
│ │ - Ollama │ │ - Promtail │ │
│ │ - Vikunja │ │ - Grafana │ │
│ └─────────────┘ └──────────────┘ │
└──────────────────────────────────────┘
[Promtail Agent]
[Loki Storage]
```
### Service Internal Networks
Services with databases use isolated internal networks:
```
karakeep
├── homelab (external traffic)
└── karakeep_internal
├── karakeep (app)
├── karakeep-chrome (browser)
└── karakeep-meilisearch (search)
vikunja
├── homelab (external traffic)
└── vikunja_internal
├── vikunja (app)
└── vikunja-db (postgres)
monitoring/logging
├── homelab (external traffic)
└── logging_internal
├── loki (storage)
├── promtail (collector)
└── grafana (UI)
```
## 🔐 Security Architecture
### Authentication Flow
```
User Request
[Traefik] → Check route rules
[Tinyauth Middleware] → Forward Auth
[LLDAP] → Verify credentials
[Backend Service] → Authorized access
```
### SSL/TLS
- **Certificate Provider**: Let's Encrypt
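- **Challenge Type**: HTTP-01 (validated over port 80)
- **Automatic Renewal**: Via Traefik
- **Domains** (one certificate per hostname; HTTP-01 cannot issue wildcard certificates):
- Primary: `<service>.fig.systems`
- Fallback: `<service>.edfig.dev`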
### SSO Protection
**Protected Services** (require authentication):
- Traefik Dashboard
- LLDAP
- Sonarr, Radarr, SABnzbd, qBittorrent
- Profilarr, Recyclarr (monitoring)
- Homarr, Backrest
- Karakeep, Vikunja, LubeLogger
- Calibre-web, Booklore, FreshRSS, File Browser
- Loki API, Ollama API
**Unprotected Services** (own authentication or public by design):
- Tinyauth (SSO provider itself)
- Jellyfin (own user system)
- Jellyseerr (linked to Jellyfin)
- Immich (own user system)
- RSSHub (public feed generator)
- MicroBin (public pastebin)
- Grafana (own authentication)
- Uptime Kuma (own authentication)
## 📊 Logging Architecture
### Centralized Logging with Loki
Promtail collects the logs of every container and forwards them to Loki:
```
[Docker Container] → stdout/stderr
[Docker Socket] → /var/run/docker.sock
[Promtail] → Scrapes logs via Docker API
[Loki] → Stores and indexes logs
[Grafana] → Query and visualize
```
### Log Labels
Promtail automatically adds labels to all logs:
- `container`: Container name
- `compose_project`: Docker Compose project
- `compose_service`: Service name from compose
- `image`: Docker image name
- `stream`: stdout or stderr
### Log Retention
- **Default**: 30 days
- **Storage**: `compose/monitoring/logging/loki-data/`
- **Automatic cleanup**: Enabled via Loki compactor
### Querying Logs
**View all logs for a service:**
```logql
{container="sonarr"}
```
**Filter by log level:**
```logql
{container="radarr"} |= "ERROR"
```
**Multiple services:**
```logql
{container=~"sonarr|radarr"}
```
**Search with JSON parsing:**
```logql
{container="karakeep"} |= "ollama" | json
```
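The same queries can be built and sent from a shell. This is a minimal sketch: `logql_selector` and `loki_query` are hypothetical helper names, and the `LOKI_URL` default is an assumption (this setup exposes no Loki port on the host and puts the Loki API behind SSO, so adjust it to wherever Loki is reachable). The `/loki/api/v1/query_range` endpoint and its `query` and `limit` parameters are part of Loki's HTTP API.

```bash
#!/usr/bin/env bash
# Build a LogQL stream selector for one or more container names.
# One name gives an exact match; several names give a regex match.
logql_selector() {
  local IFS='|'
  if [ "$#" -eq 1 ]; then
    printf '{container="%s"}\n' "$1"
  else
    printf '{container=~"%s"}\n' "$*"
  fi
}

# Send a query to Loki's HTTP API (not called here; needs a reachable Loki).
# LOKI_URL is an assumption, not a value taken from this repository.
loki_query() {
  curl -sG "${LOKI_URL:-http://localhost:3100}/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "limit=${2:-20}"
}
```

For example, `loki_query "$(logql_selector sonarr radarr) |= \"ERROR\""` would request recent error lines from both containers.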
## 🌐 Network Configuration
### Docker Networks
**homelab** (external bridge):
- Type: External bridge network
- Subnet: Auto-assigned by Docker (the `172.18.0.0/16` range in the network diagram is a typical example, not a fixed setting)
- Purpose: Inter-service communication + Traefik routing
- Create: `docker network create homelab`
**Service-specific internal networks**:
- `karakeep_internal`: Karakeep + Chrome + Meilisearch
- `vikunja_internal`: Vikunja + PostgreSQL
- `logging_internal`: Loki + Promtail + Grafana
- etc.
### Port Mappings
**External Ports** (exposed to host):
- `80/tcp`: HTTP (Traefik) - redirects to HTTPS
- `443/tcp`: HTTPS (Traefik)
- `6881/tcp+udp`: BitTorrent (qBittorrent)
**No other ports exposed** - all access via Traefik reverse proxy.
## 🔧 Traefik Integration
### Standard Traefik Labels
All services use consistent Traefik labels:
```yaml
labels:
  # Enable Traefik
  traefik.enable: true
  traefik.docker.network: homelab
  # Router configuration
  traefik.http.routers.<service>.rule: Host(`<service>.fig.systems`) || Host(`<service>.edfig.dev`)
  traefik.http.routers.<service>.entrypoints: websecure
  traefik.http.routers.<service>.tls.certresolver: letsencrypt
  # Service configuration (backend port)
  traefik.http.services.<service>.loadbalancer.server.port: <port>
  # SSO middleware (if protected)
  traefik.http.routers.<service>.middlewares: tinyauth
  # Homarr auto-discovery
  homarr.name: <Service Name>
  homarr.group: <Category>
  homarr.icon: mdi:<icon-name>
```
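Because the label set is the same for every service, it can be generated from a service name and port. The sketch below is only an illustration: `traefik_labels` is a hypothetical helper, and it assumes the hostname equals the service name, which is not true for every service here (MicroBin is served at `paste.fig.systems`).

```bash
#!/usr/bin/env bash
# Print the standard Traefik label block for a service.
# Usage: traefik_labels <service> <port> [yes|no]   (third argument: SSO-protected?)
traefik_labels() {
  local svc="$1" port="$2" protected="${3:-yes}"
  cat <<EOF
labels:
  traefik.enable: true
  traefik.docker.network: homelab
  traefik.http.routers.${svc}.rule: Host(\`${svc}.fig.systems\`) || Host(\`${svc}.edfig.dev\`)
  traefik.http.routers.${svc}.entrypoints: websecure
  traefik.http.routers.${svc}.tls.certresolver: letsencrypt
  traefik.http.services.${svc}.loadbalancer.server.port: ${port}
EOF
  # The SSO middleware line is only emitted for protected services.
  if [ "$protected" = "yes" ]; then
    echo "  traefik.http.routers.${svc}.middlewares: tinyauth"
  fi
}
```

`traefik_labels rsshub 1200 no` prints the block without the `tinyauth` middleware, matching the public RSSHub configuration; the Homarr labels are left out because their values are chosen per service.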
### Middleware
**tinyauth** - Forward authentication:
```yaml
# Defined in traefik/compose.yaml
middlewares:
  tinyauth:
    forwardAuth:
      address: http://tinyauth:8080
      trustForwardHeader: true
```
## 💾 Volume Management
### Volume Types
**Bind Mounts** (host directories):
```yaml
volumes:
  - ./data:/data # Service data
  - ./config:/config # Configuration files
  - /media:/media # Media library (shared)
```
**Named Volumes** (Docker-managed):
```yaml
volumes:
  - loki-data:/loki # Loki storage
  - postgres-data:/var/lib/postgresql/data
```
### Media Directory Structure
```
/media/
├── tv/ # TV shows (Sonarr → Jellyfin)
├── movies/ # Movies (Radarr → Jellyfin)
├── music/ # Music
├── photos/ # Photos (Immich)
├── books/ # Ebooks (Calibre-web)
├── audiobooks/ # Audiobooks
├── comics/ # Comics
├── homemovies/ # Home videos
├── downloads/ # Active downloads (SABnzbd/qBittorrent)
├── complete/ # Completed downloads
└── incomplete/ # In-progress downloads
```
### Backup Strategy
**Important directories to backup:**
```
compose/core/lldap/data/ # User directory
compose/core/traefik/letsencrypt/ # SSL certificates
compose/services/*/config/ # Service configurations
compose/services/*/data/ # Service data
compose/monitoring/logging/loki-data/ # Logs (optional)
/media/ # Media library
```
**Excluded from backups:**
```
compose/services/*/db/ # Databases (backup via dump)
/media/downloads/ # Temporary downloads
/media/incomplete/ # Incomplete downloads
```
## 🎮 GPU Acceleration
### NVIDIA GTX 1070 Configuration
**GPU Passthrough (Proxmox → VM):**
1. **Proxmox host** (`/etc/pve/nodes/<node>/qemu-server/<vmid>.conf`):
```
hostpci0: 0000:01:00,pcie=1,x-vga=1
```
2. **VM (AlmaLinux)** - Install NVIDIA drivers:
```bash
# Add NVIDIA repository
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
# Install drivers
sudo dnf install nvidia-driver nvidia-settings
# Verify
nvidia-smi
```
3. **Docker** - Install NVIDIA Container Toolkit:
```bash
# Add NVIDIA Container Toolkit repo
sudo dnf config-manager --add-repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo
# Install toolkit
sudo dnf install nvidia-container-toolkit
# Configure Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
# Verify
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```
### Services Using GPU
**Jellyfin** (Hardware transcoding):
```yaml
# Uncomment in compose.yaml
devices:
  - /dev/dri:/dev/dri # DRI render nodes (VAAPI); NVENC/NVDEC is exposed through the NVIDIA runtime
environment:
  - NVIDIA_VISIBLE_DEVICES=all
  - NVIDIA_DRIVER_CAPABILITIES=all
```
**Immich** (AI features):
```yaml
# Already configured
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
```
**Ollama** (LLM inference):
```yaml
# Uncomment in compose.yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
```
### GPU Performance Tuning
**For Ryzen 5 7600X + GTX 1070:**
- **Jellyfin**: Can transcode 4-6 simultaneous 4K → 1080p streams
- **Ollama**:
- 3B models: 40-60 tokens/sec
- 7B models: 20-35 tokens/sec
- 13B models: 10-15 tokens/sec (quantized)
- **Immich**: AI tagging ~5-10 images/sec
## 🚀 Resource Allocation
### CPU Allocation (Ryzen 5 7600X - 6C/12T)
**High Priority** (4-6 cores):
- Jellyfin (transcoding)
- Sonarr/Radarr (media processing)
- Ollama (when running)
**Medium Priority** (2-4 cores):
- Immich (AI processing)
- Karakeep (bookmark processing)
- SABnzbd/qBittorrent (downloads)
**Low Priority** (1-2 cores):
- Traefik, LLDAP, Tinyauth
- Monitoring services
- Other utilities
### RAM Allocation (32GB Total, 24GB VM)
**Recommended allocation:**
```
Host (Proxmox): 8GB
VM Total: 24GB breakdown:
├── System: 4GB (AlmaLinux base)
├── Docker: 2GB (daemon overhead)
├── Jellyfin: 2-4GB (transcoding buffers)
├── Immich: 2-3GB (ML models + database)
├── Sonarr/Radarr: 1GB each
├── Ollama: 4-6GB (when running models)
├── Databases: 2-3GB total
├── Monitoring: 2GB (Loki + Grafana)
└── Other services: 4-5GB
```
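Summing the ranges in the breakdown above shows how tight the budget is. The figures below are copied from that breakdown, with Sonarr and Radarr entered separately at 1GB each; `ram_totals` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Each entry: "name low high" in GB, taken from the breakdown above.
allocations="
system 4 4
docker 2 2
jellyfin 2 4
immich 2 3
sonarr 1 1
radarr 1 1
ollama 4 6
databases 2 3
monitoring 2 2
other 4 5
"

# Add up the low and high ends of every range.
ram_totals() {
  local name low high sum_low=0 sum_high=0
  while read -r name low high; do
    [ -z "$name" ] && continue
    sum_low=$((sum_low + low))
    sum_high=$((sum_high + high))
  done <<< "$allocations"
  echo "$sum_low $sum_high"
}

ram_totals   # prints "24 31"
```

The low ends add up to exactly the 24GB given to the VM, while the high ends add up to 31GB, so the plan only holds if the heavier services (Jellyfin transcoding, Ollama, Immich) do not all peak at once. Per-container memory limits are one way to enforce that.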
### Disk Space Planning
**System:** 100GB
**Docker:** 50GB (images + containers)
**Service Data:** 50GB (configs, databases, logs)
**Media Library:** Remaining space (expandable)
**Recommended VM disk:**
- Minimum: 500GB (200GB for system, Docker, and service data + 300GB media)
- Recommended: 1TB+ (allows room for growth)
## 🔄 Service Dependencies
### Startup Order
**Critical order for initial deployment:**
1. **Networks**: `docker network create homelab`
2. **Core** (must start first):
- Traefik (reverse proxy)
- LLDAP (user directory)
- Tinyauth (SSO provider)
3. **Monitoring** (optional but recommended):
- Loki + Promtail + Grafana
- Uptime Kuma
4. **Media Automation**:
- Sonarr, Radarr
- SABnzbd, qBittorrent
- Recyclarr, Profilarr
5. **Media Frontend**:
- Jellyfin
- Jellyseerr
- Immich
6. **Services**:
- Karakeep, Ollama (AI features)
- Vikunja, Homarr
- All other services
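The startup order above can be expressed as an ordered list of stack directories. This is a sketch under stated assumptions: only the core and monitoring paths appear in these docs, so the list stops there, and `deploy_in_order` is a hypothetical helper that defaults to a dry run which prints commands instead of running them.

```bash
#!/usr/bin/env bash
# Stack directories in dependency order. Extend the list with the media and
# services stacks, whose exact paths are not listed in this guide.
STACKS=(
  compose/core/traefik
  compose/core/lldap
  compose/core/tinyauth
  compose/monitoring/logging
  compose/monitoring/uptime
)

# Bring stacks up one by one, stopping at the first failure.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.
deploy_in_order() {
  local root="${1:-$HOME/homelab}" dir
  for dir in "${STACKS[@]}"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "cd $root/$dir && docker compose up -d"
    else
      (cd "$root/$dir" && docker compose up -d) || return 1
    fi
  done
}
```

Run `deploy_in_order` to review the plan, then `DRY_RUN=0 deploy_in_order` to execute it. The `homelab` network from step 1 must already exist.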
### Service Integration Map
```
Traefik
├─→ All services (reverse proxy)
└─→ Let's Encrypt (SSL)
Tinyauth
├─→ LLDAP (authentication backend)
└─→ All SSO-protected services
LLDAP
└─→ User database for SSO
Promtail
├─→ Docker socket (log collection)
└─→ Loki (log forwarding)
Loki
└─→ Grafana (log visualization)
Karakeep
├─→ Ollama (AI tagging)
├─→ Meilisearch (search)
└─→ Chrome (web archiving)
Jellyseerr
├─→ Jellyfin (media info)
├─→ Sonarr (TV requests)
└─→ Radarr (movie requests)
Sonarr/Radarr
├─→ SABnzbd/qBittorrent (downloads)
├─→ Jellyfin (media library)
└─→ Recyclarr/Profilarr (quality profiles)
Homarr
└─→ All services (dashboard auto-discovery)
```
## 🐛 Troubleshooting
### Check Service Health
```bash
# All services status
cd ~/homelab
docker ps -a --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
# Logs for specific service
docker logs <service-name> --tail 100 -f
# Logs via Loki/Grafana
# Go to https://logs.fig.systems
# Query: {container="<service-name>"}
```
### Network Issues
```bash
# Check homelab network exists
docker network ls | grep homelab
# Inspect network
docker network inspect homelab
# Test service connectivity
docker exec <service-a> ping <service-b>
docker exec karakeep curl http://ollama:11434
```
### GPU Not Detected
```bash
# Check GPU in VM
nvidia-smi
# Check Docker can access GPU
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
# Check service GPU allocation
docker exec jellyfin nvidia-smi
docker exec ollama nvidia-smi
```
### SSL Certificate Issues
```bash
# Check Traefik logs
docker logs traefik | grep -i certificate
# Force certificate re-issuance (deletes ALL stored certificates; repeated use can hit Let's Encrypt rate limits)
docker exec traefik rm -rf /letsencrypt/acme.json
docker restart traefik
# Verify DNS
dig +short sonarr.fig.systems
```
### SSO Not Working
```bash
# Check Tinyauth status
docker logs tinyauth
# Check LLDAP connection
docker exec tinyauth nc -zv lldap 3890
docker exec tinyauth nc -zv lldap 17170
# Verify credentials match
grep LDAP_BIND_PASSWORD compose/core/tinyauth/.env
grep LLDAP_LDAP_USER_PASS compose/core/lldap/.env
```
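The two `grep` commands print both secrets to the terminal. A comparison that keeps them hidden could look like the sketch below. The key names come from the commands above; `env_value` and `check_ldap_password_match` are hypothetical helpers, and the parsing assumes plain `KEY=value` lines without quotes.

```bash
#!/usr/bin/env bash
# Read KEY's value from a dotenv-style file (the last assignment wins).
env_value() {
  local file="$1" key="$2"
  grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2-
}

# Compare the Tinyauth bind password with the LLDAP password
# without printing either secret.
check_ldap_password_match() {
  local tinyauth_env="$1" lldap_env="$2" a b
  a=$(env_value "$tinyauth_env" LDAP_BIND_PASSWORD)
  b=$(env_value "$lldap_env" LLDAP_LDAP_USER_PASS)
  if [ -n "$a" ] && [ "$a" = "$b" ]; then
    echo "match"
  else
    echo "MISMATCH"
    return 1
  fi
}
```

Usage: `check_ldap_password_match compose/core/tinyauth/.env compose/core/lldap/.env`.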
## 📈 Monitoring Best Practices
### Key Metrics to Monitor
**System Level:**
- CPU usage per container
- Memory usage per container
- Disk I/O
- Network throughput
- GPU utilization (for Jellyfin/Ollama/Immich)
**Application Level:**
- Traefik request rate
- Failed authentication attempts
- Jellyfin concurrent streams
- Download speeds (SABnzbd/qBittorrent)
- Sonarr/Radarr queue size
### Uptime Kuma Monitoring
Configure monitors for:
- **HTTP(s)**: All web services (200 status check)
- **TCP**: Database ports (PostgreSQL, etc.)
- **Docker**: Container health (via Docker socket)
- **SSL**: Certificate expiration (30-day warning)
### Log Monitoring
Set up Loki alerts for:
- ERROR level logs
- Authentication failures
- Service crashes
- Disk space warnings
## 🔧 Maintenance Tasks
### Daily
- Check Uptime Kuma dashboard
- Review any critical alerts
### Weekly
- Check disk space: `df -h`
- Review failed downloads in Sonarr/Radarr
- Check Loki logs for errors
### Monthly
- Update all containers: `docker compose pull && docker compose up -d`
- Review and clean old Docker images: `docker image prune -a`
- Backup configurations
- Check SSL certificate renewal
### Quarterly
- Review and update documentation
- Clean up old media (if needed)
- Review and adjust quality profiles
- Update Recyclarr configurations
## 📚 Additional Resources
- [Traefik Documentation](https://doc.traefik.io/traefik/)
- [Docker Compose Best Practices](https://docs.docker.com/compose/production/)
- [Loki LogQL Guide](https://grafana.com/docs/loki/latest/logql/)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/)
- [Proxmox GPU Passthrough](https://pve.proxmox.com/wiki/PCI_Passthrough)
- [AlmaLinux Documentation](https://wiki.almalinux.org/)
---
**System Ready!** 🚀

775
docs/setup/almalinux-vm.md Normal file

@@ -0,0 +1,775 @@
# AlmaLinux 9.6 VM Setup Guide
Complete setup guide for the homelab VM on AlmaLinux 9.6 running on Proxmox VE 9.
## Hardware Context
- **Host**: Proxmox VE 9 (Debian 13 based)
- CPU: AMD Ryzen 5 7600X (6C/12T, 5.3 GHz boost)
- GPU: NVIDIA GTX 1070 (8GB VRAM)
- RAM: 32GB DDR5
- **VM Allocation**:
- OS: AlmaLinux 9.6 (RHEL 9 compatible)
- CPU: 8 vCPUs
- RAM: 24GB
- Disk: 500GB+ (expandable)
- GPU: GTX 1070 (PCIe passthrough)
## Proxmox VM Creation
### 1. Create VM
```bash
# On Proxmox host
# The q35 machine type is required for the pcie=1 passthrough option used below
qm create 100 \
  --name homelab \
  --machine q35 \
  --memory 24576 \
  --cores 8 \
  --cpu host \
  --sockets 1 \
  --net0 virtio,bridge=vmbr0 \
  --scsi0 local-lvm:500 \
  --ostype l26 \
  --boot order=scsi0
# Attach AlmaLinux ISO
qm set 100 --ide2 local:iso/AlmaLinux-9.6-x86_64-dvd.iso,media=cdrom
# Enable UEFI
qm set 100 --bios ovmf --efidisk0 local-lvm:1
```
### 2. GPU Passthrough
**Find GPU PCI address:**
```bash
lspci | grep -i nvidia
# Example output: 01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070]
```
**Enable IOMMU in Proxmox:**
Edit `/etc/default/grub`:
```bash
# For AMD CPU (Ryzen 5 7600X)
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
```
Update GRUB and reboot:
```bash
update-grub
reboot
```
**Verify IOMMU:**
```bash
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
# Should show IOMMU enabled
```
**Add GPU to VM:**
Edit `/etc/pve/qemu-server/100.conf`:
```
hostpci0: 0000:01:00,pcie=1,x-vga=1
```
Or via command:
```bash
qm set 100 --hostpci0 0000:01:00,pcie=1,x-vga=1
```
**Blacklist GPU on host:**
Edit `/etc/modprobe.d/blacklist-nvidia.conf`:
```
blacklist nouveau
blacklist nvidia
blacklist nvidia_drm
blacklist nvidia_modeset
blacklist nvidia_uvm
```
Update initramfs:
```bash
update-initramfs -u
reboot
```
## AlmaLinux Installation
### 1. Install AlmaLinux 9.6
Start VM and follow installer:
1. **Language**: English (US)
2. **Installation Destination**: Use all space, automatic partitioning
3. **Network**: Enable and set hostname to `homelab.fig.systems`
4. **Software Selection**: Minimal Install
5. **Root Password**: Set strong password
6. **User Creation**: Create admin user (e.g., `homelab`)
### 2. Post-Installation Configuration
```bash
# SSH into VM
ssh homelab@<vm-ip>
# Update system
sudo dnf update -y
# Install essential tools
sudo dnf install -y \
vim \
git \
curl \
wget \
htop \
ncdu \
tree \
tmux \
bind-utils \
net-tools \
firewalld
# Enable and configure firewall
sudo systemctl enable --now firewalld
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --permanent --add-service=https
sudo firewall-cmd --reload
```
### 3. Configure Static IP (Optional)
```bash
# Find connection name
nmcli connection show
# Set static IP (example: 192.168.1.100)
sudo nmcli connection modify "System eth0" \
ipv4.addresses 192.168.1.100/24 \
ipv4.gateway 192.168.1.1 \
ipv4.dns "1.1.1.1,8.8.8.8" \
ipv4.method manual
# Restart network
sudo nmcli connection down "System eth0"
sudo nmcli connection up "System eth0"
```
## Docker Installation
### 1. Install Docker Engine
```bash
# Remove old versions
sudo dnf remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
# Add Docker repository (dnf-plugins-core provides `dnf config-manager`)
sudo dnf install -y dnf-plugins-core
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# Install Docker
sudo dnf install -y \
docker-ce \
docker-ce-cli \
containerd.io \
docker-buildx-plugin \
docker-compose-plugin
# Start Docker
sudo systemctl enable --now docker
# Verify
sudo docker run hello-world
```
### 2. Configure Docker
**Add user to docker group:**
```bash
sudo usermod -aG docker $USER
newgrp docker
# Verify (no sudo needed)
docker ps
```
**Configure Docker daemon:**
Create `/etc/docker/daemon.json`:
```json
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
},
"storage-driver": "overlay2",
"features": {
"buildkit": true
}
}
```
Restart Docker:
```bash
sudo systemctl restart docker
```
## NVIDIA GPU Setup
### 1. Install NVIDIA Drivers
```bash
# Add EPEL repository
sudo dnf install -y epel-release
# Add NVIDIA repository
sudo dnf config-manager --add-repo \
https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
# Install drivers
sudo dnf install -y \
nvidia-driver \
nvidia-driver-cuda \
nvidia-settings \
nvidia-persistenced
# Reboot to load drivers
sudo reboot
```
### 2. Verify GPU
```bash
# Check driver version
nvidia-smi
# Expected output:
# +-----------------------------------------------------------------------------+
# | NVIDIA-SMI 535.xx.xx Driver Version: 535.xx.xx CUDA Version: 12.2 |
# |-------------------------------+----------------------+----------------------+
# | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
# | 0 GeForce GTX 1070 Off | 00000000:01:00.0 Off | N/A |
# +-------------------------------+----------------------+----------------------+
```
### 3. Install NVIDIA Container Toolkit
```bash
# Add NVIDIA Container Toolkit repository
sudo dnf config-manager --add-repo \
https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo
# Install toolkit
sudo dnf install -y nvidia-container-toolkit
# Configure Docker to use nvidia runtime
sudo nvidia-ctk runtime configure --runtime=docker
# Restart Docker
sudo systemctl restart docker
# Test GPU in container
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```
## Storage Setup
### 1. Create Media Directory
```bash
# Create media directory structure
sudo mkdir -p /media/{tv,movies,music,photos,books,audiobooks,comics,homemovies}
sudo mkdir -p /media/{downloads,complete,incomplete}
# Set ownership
sudo chown -R $USER:$USER /media
# Set permissions
chmod -R 755 /media
```
### 2. Mount Additional Storage (Optional)
If using separate disk for media:
```bash
# Find disk
lsblk
# Format disk (example: /dev/sdb)
sudo mkfs.ext4 /dev/sdb
# Get UUID
sudo blkid /dev/sdb
# Add to /etc/fstab
echo "UUID=<uuid> /media ext4 defaults,nofail 0 2" | sudo tee -a /etc/fstab
# Mount
sudo mount -a
```
## Homelab Repository Setup
### 1. Clone Repository
```bash
# Create workspace
mkdir -p ~/homelab
cd ~/homelab
# Clone repository
git clone https://github.com/efigueroa/homelab.git .
# Or if using SSH
git clone git@github.com:efigueroa/homelab.git .
```
### 2. Create Docker Network
```bash
# Create homelab network
docker network create homelab
# Verify
docker network ls | grep homelab
```
### 3. Configure Environment Variables
```bash
# Generate secrets for all services
cd ~/homelab
# LLDAP
cd compose/core/lldap
openssl rand -hex 32 > /tmp/lldap_jwt_secret
openssl rand -base64 32 | tr -d /=+ | cut -c1-32 > /tmp/lldap_pass
# Update .env with generated secrets
# Tinyauth
cd ../tinyauth
openssl rand -hex 32 > /tmp/tinyauth_session
# Update .env (LDAP_BIND_PASSWORD must match LLDAP)
# Continue for all services...
```
See [`docs/guides/secrets-management.md`](../guides/secrets-management.md) for complete guide.
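The "Update .env" steps can be done without hand-editing. The following is a minimal sketch: `set_env_var` and `new_secret` are hypothetical helpers, and the key name in the usage line is an assumption, so check each service's `.env` for the real variable names.

```bash
#!/usr/bin/env bash
# Set KEY=VALUE in a dotenv file: drop any existing KEY line, then append.
set_env_var() {
  local file="$1" key="$2" value="$3" tmp
  touch "$file"
  tmp=$(mktemp)
  grep -v -E "^${key}=" "$file" > "$tmp" || true   # grep exits 1 when nothing is left
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$file"   # overwrite in place to keep the file's permissions
  rm -f "$tmp"
}

# Generate a 64-character hex secret, as in the steps above.
new_secret() { openssl rand -hex 32; }
```

Usage (key name assumed): `set_env_var compose/core/lldap/.env LLDAP_JWT_SECRET "$(new_secret)"`. Writing straight into `.env` also avoids leaving secrets behind in `/tmp`.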
## SELinux Configuration
AlmaLinux uses SELinux by default. Configure for Docker:
```bash
# Check SELinux status
getenforce
# Should show: Enforcing
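# Allow containers to manage cgroups (needed by containers that run systemd);
# bind-mount access is handled by the labeling options below
sudo setsebool -P container_manage_cgroup on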
# If you encounter permission issues:
# Option 1: Add SELinux context to directories
sudo chcon -R -t container_file_t ~/homelab/compose
sudo chcon -R -t container_file_t /media
# Option 2: Use :Z flag in docker volumes (auto-relabels)
# Example: ./data:/data:Z
# Option 3: Set SELinux to permissive (not recommended)
# sudo setenforce 0
```
## System Tuning
### 1. Increase File Limits
```bash
# Add to /etc/security/limits.conf
echo "* soft nofile 65536" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 65536" | sudo tee -a /etc/security/limits.conf
# Add to /etc/sysctl.conf
echo "fs.file-max = 65536" | sudo tee -a /etc/sysctl.conf
echo "fs.inotify.max_user_watches = 524288" | sudo tee -a /etc/sysctl.conf
# Apply
sudo sysctl -p
```
### 2. Optimize for Media Server
```bash
# Network tuning
echo "net.core.rmem_max = 134217728" | sudo tee -a /etc/sysctl.conf
echo "net.core.wmem_max = 134217728" | sudo tee -a /etc/sysctl.conf
echo "net.ipv4.tcp_rmem = 4096 87380 67108864" | sudo tee -a /etc/sysctl.conf
echo "net.ipv4.tcp_wmem = 4096 65536 67108864" | sudo tee -a /etc/sysctl.conf
# Apply
sudo sysctl -p
```
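The `tee -a` commands in this section append a duplicate line every time they are re-run. A small guard makes the step repeatable. `append_once` is a hypothetical helper, and the drop-in file name in the usage note is an assumption.

```bash
#!/usr/bin/env bash
# Append LINE to FILE only if an identical line is not already present.
append_once() {
  local file="$1" line="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Run as root, for example `append_once /etc/sysctl.d/99-homelab.conf "net.core.rmem_max = 134217728"`, then apply with `sudo sysctl --system`. A drop-in under `/etc/sysctl.d/` keeps these settings apart from the distribution's `/etc/sysctl.conf`.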
### 3. CPU Governor (Ryzen 5 7600X)
```bash
# Install cpupower
sudo dnf install -y kernel-tools
# Set to performance mode
sudo cpupower frequency-set -g performance
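# Equivalent direct write via sysfs (does not persist across reboots)
echo "performance" | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# To persist across reboots, enable the cpupower service shipped with kernel-tools
# (its start options are read from /etc/sysconfig/cpupower)
sudo systemctl enable --now cpupower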
```
## Deployment
### 1. Deploy Core Services
```bash
cd ~/homelab
# Create network
docker network create homelab
# Deploy Traefik
cd compose/core/traefik
docker compose up -d
# Deploy LLDAP
cd ../lldap
docker compose up -d
# Wait for LLDAP to be ready (30 seconds)
sleep 30
# Deploy Tinyauth
cd ../tinyauth
docker compose up -d
```
### 2. Configure LLDAP
```bash
# Access LLDAP web UI
# https://lldap.fig.systems
# 1. Login with admin credentials from .env
# 2. Create observer user for tinyauth
# 3. Create regular users
```
### 3. Deploy Monitoring
```bash
cd ~/homelab
# Deploy logging stack
cd compose/monitoring/logging
docker compose up -d
# Deploy uptime monitoring
cd ../uptime
docker compose up -d
```
### 4. Deploy Services
See [`README.md`](../../README.md) for complete deployment order.
## Verification
### 1. Check All Services
```bash
# List all running containers
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
# Check networks
docker network ls
# Check volumes
docker volume ls
```
### 2. Test GPU Access
```bash
# Test in Jellyfin
docker exec jellyfin nvidia-smi
# Test in Ollama
docker exec ollama nvidia-smi
# Test in Immich
docker exec immich-machine-learning nvidia-smi
```
### 3. Test Logging
```bash
# Check Promtail is collecting logs
docker logs promtail 2>&1 | grep -i "clients configured"
# Access Grafana
# https://logs.fig.systems
# Query logs
# {container="traefik"}
```
### 4. Test SSL
```bash
# Check certificate
curl -vI https://sonarr.fig.systems 2>&1 | grep -iE "subject:|issuer:"
# Issuer should be Let's Encrypt; subject should match the requested hostname
```
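To see how long a certificate has left, the expiry date can be converted to a day count. This is a sketch assuming GNU `date` (as shipped with AlmaLinux); `days_until` and `cert_not_after` are hypothetical helper names, and `cert_not_after` needs network access to the host being checked.

```bash
#!/usr/bin/env bash
# Days between a reference time (default: now) and an openssl-style
# notAfter date such as "Jan 11 00:00:00 2025 GMT". GNU date is assumed.
days_until() {
  local expiry_epoch now_epoch
  expiry_epoch=$(date -u -d "$1" +%s) || return 1
  now_epoch=${2:-$(date -u +%s)}
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# Fetch the certificate end date for a host (not called here; needs network).
cert_not_after() {
  echo | openssl s_client -servername "$1" -connect "$1:443" 2>/dev/null |
    openssl x509 -noout -enddate | cut -d= -f2
}
```

Usage: `days_until "$(cert_not_after sonarr.fig.systems)"`. A result below 30 would be a reason to look at the Traefik logs for renewal errors.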
## Backup Strategy
### 1. VM Snapshots (Proxmox)
```bash
# On Proxmox host
# Create snapshot before major changes
qm snapshot 100 pre-update-$(date +%Y%m%d)
# List snapshots
qm listsnapshot 100
# Restore snapshot
qm rollback 100 <snapshot-name>
```
### 2. Configuration Backup
```bash
# On VM
cd ~/homelab
# Back up compose files and .env files (excludes data, database, model, and config directories)
tar czf homelab-config-$(date +%Y%m%d).tar.gz \
--exclude='*/data' \
--exclude='*/db' \
--exclude='*/pgdata' \
--exclude='*/config' \
--exclude='*/models' \
--exclude='*_data' \
compose/
# Backup to external storage
scp homelab-config-*.tar.gz user@backup-server:/backups/
```
### 3. Automated Backups with Backrest
Backrest service is included and configured. See:
- `compose/services/backrest/`
- Access: https://backup.fig.systems
## Maintenance
### Weekly
```bash
# Update containers
cd ~/homelab
find compose -name "compose.yaml" -type f | while read -r compose; do
  dir=$(dirname "$compose")
  echo "Updating $dir"
  # Run in a subshell so the working directory is restored automatically
  (cd "$dir" && docker compose pull && docker compose up -d)
done
# Clean up old images
docker image prune -a -f
# Check disk space
df -h
ncdu /media
```
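The disk space check can be turned into a pass/fail test for use in a cron job or script. This is a minimal sketch: `disk_usage_pct` and `check_disk` are hypothetical helpers, and the 85% default threshold is an arbitrary choice.

```bash
#!/usr/bin/env bash
# Print the usage percentage (0-100) of the filesystem holding PATH.
disk_usage_pct() {
  df -P "${1:-/}" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Report usage and return non-zero when it reaches the limit.
check_disk() {
  local path="${1:-/}" limit="${2:-85}" pct
  pct=$(disk_usage_pct "$path")
  if [ "$pct" -ge "$limit" ]; then
    echo "WARNING: $path is ${pct}% full (limit ${limit}%)"
    return 1
  fi
  echo "OK: $path is ${pct}% full"
}
```

Usage: `check_disk /media 90 || echo "time to clean up downloads"`.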
### Monthly
```bash
# Update AlmaLinux
sudo dnf update -y
# Update NVIDIA drivers (if available)
sudo dnf update -y 'nvidia-driver*'
# Reboot if kernel updated
sudo reboot
```
## Troubleshooting
### Services Won't Start
```bash
# Check SELinux denials
sudo ausearch -m avc -ts recent
# If SELinux is blocking:
sudo setsebool -P container_manage_cgroup on
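# Or relabel directories for container access
# (restorecon would reset them to the policy default and undo the container label)
sudo chcon -R -t container_file_t ~/homelab/compose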
```
### GPU Not Detected
```bash
# Check GPU is passed through
lspci | grep -i nvidia
# Check drivers loaded
lsmod | grep nvidia
# Reinstall drivers
sudo dnf reinstall -y 'nvidia-driver*'
sudo reboot
```
### Network Issues
```bash
# Check firewall
sudo firewall-cmd --list-all
# Add ports if needed
sudo firewall-cmd --permanent --add-port=80/tcp
sudo firewall-cmd --permanent --add-port=443/tcp
sudo firewall-cmd --reload
# Check Docker network
docker network inspect homelab
```
### Permission Denied Errors
```bash
# Check ownership
ls -la ~/homelab/compose/*/
# Fix ownership
sudo chown -R $USER:$USER ~/homelab
# Check SELinux context
ls -Z ~/homelab/compose
# Fix SELinux labels
sudo chcon -R -t container_file_t ~/homelab/compose
```
## Performance Monitoring
### System Stats
```bash
# CPU usage
htop
# GPU usage
watch -n 1 nvidia-smi
# Disk I/O
iostat -x 1
# Network
iftop
# Per-container stats
docker stats
```
### Resource Limits
Example container resource limits:
```yaml
# In compose.yaml
deploy:
  resources:
    limits:
      cpus: '2.0'
      memory: 4G
    reservations:
      cpus: '1.0'
      memory: 2G
```
## Security Hardening
### 1. Disable Root SSH
```bash
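# Set PermitRootLogin to "no" in /etc/ssh/sshd_config
# (matches the line whether it is commented out or active, whatever its current value)
sudo sed -i -E 's/^#?PermitRootLogin .*/PermitRootLogin no/' /etc/ssh/sshd_config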
# Restart SSH
sudo systemctl restart sshd
```
### 2. Configure Fail2Ban
```bash
# Install
sudo dnf install -y fail2ban
# Configure
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
# Edit /etc/fail2ban/jail.local
# [sshd]
# enabled = true
# maxretry = 3
# bantime = 3600
# Start
sudo systemctl enable --now fail2ban
```
### 3. Automatic Updates
```bash
# Install dnf-automatic
sudo dnf install -y dnf-automatic
# Configure /etc/dnf/automatic.conf
# apply_updates = yes
# Enable
sudo systemctl enable --now dnf-automatic.timer
```
## Next Steps
1. ✅ VM created and AlmaLinux installed
2. ✅ Docker and NVIDIA drivers configured
3. ✅ Homelab repository cloned
4. ✅ Network and storage configured
5. ⬜ Deploy core services
6. ⬜ Configure SSO
7. ⬜ Deploy all services
8. ⬜ Configure backups
9. ⬜ Set up monitoring
---
**System ready for deployment!** 🚀