docs: Add architecture docs and fix compose files for integration
parent 9fbd003798
commit 07a8154fea
6 changed files with 1610 additions and 4 deletions
108
README.md
|
|
@@ -2,6 +2,23 @@
|
|||
|
||||
This repository contains Docker Compose configurations for self-hosted home services.
|
||||
|
||||
## 💻 Hardware Specifications
|
||||
|
||||
- **Host**: Proxmox VE 9 (Debian 13)
|
||||
- CPU: AMD Ryzen 5 7600X (6 cores, 12 threads, up to 5.3 GHz)
|
||||
- GPU: NVIDIA GeForce GTX 1070 (8GB VRAM)
|
||||
- RAM: 32GB DDR5
|
||||
|
||||
- **VM**: AlmaLinux 9.6 (RHEL 9 compatible)
|
||||
- CPU: 8 vCPUs
|
||||
- RAM: 24GB
|
||||
- Storage: 500GB+ (expandable)
|
||||
- GPU: GTX 1070 (PCIe passthrough)
|
||||
|
||||
**Documentation:**
|
||||
- [Complete Architecture Guide](docs/architecture.md) - Integration, networking, logging, GPU setup
|
||||
- [AlmaLinux VM Setup](docs/setup/almalinux-vm.md) - Full installation and configuration guide
|
||||
|
||||
## 🏗️ Infrastructure
|
||||
|
||||
### Core Services (Port 80/443)
|
||||
|
|
@@ -199,9 +216,21 @@ Each service has its own `.env` file where applicable. Key files to review:
|
|||
- `core/lldap/.env` - LDAP configuration and admin credentials
|
||||
- `core/tinyauth/.env` - LDAP connection and session settings
|
||||
- `media/frontend/immich/.env` - Photo management configuration
|
||||
- `services/linkwarden/.env` - Bookmark manager settings
|
||||
- `services/karakeep/.env` - AI-powered bookmark manager
|
||||
- `services/ollama/.env` - Local LLM configuration
|
||||
- `services/microbin/.env` - Pastebin configuration
|
||||
|
||||
**Example Configuration Files:**
|
||||
Several services include `.example` config files for reference:
|
||||
- `media/automation/sonarr/config.xml.example`
|
||||
- `media/automation/radarr/config.xml.example`
|
||||
- `media/automation/sabnzbd/sabnzbd.ini.example`
|
||||
- `media/automation/qbittorrent/qBittorrent.conf.example`
|
||||
- `services/vikunja/config.yml.example`
|
||||
- `services/FreshRSS/config.php.example`
|
||||
|
||||
Copy these to the appropriate location (usually `./config/`) and customize as needed.
|
||||
|
||||
## 🔧 Maintenance
|
||||
|
||||
### Viewing Logs
|
||||
|
|
@@ -241,6 +270,83 @@ Important data locations:
|
|||
2. Check LLDAP connection in tinyauth logs
|
||||
3. Verify LDAP bind credentials match in both services
|
||||
|
||||
### GPU not detected
|
||||
1. Check GPU passthrough: `lspci | grep -i nvidia`
|
||||
2. Verify drivers: `nvidia-smi`
|
||||
3. Test in container: `docker exec ollama nvidia-smi`
|
||||
4. See [AlmaLinux VM Setup](docs/setup/almalinux-vm.md) for GPU configuration
|
||||
|
||||
## 📊 Monitoring & Logging
|
||||
|
||||
### Centralized Logging (Loki + Promtail + Grafana)
|
||||
|
||||
All container logs are automatically collected and stored in Loki:
|
||||
|
||||
**Access Grafana**: https://logs.fig.systems
|
||||
|
||||
**Query examples:**
|
||||
```logql
|
||||
# View logs for specific service
|
||||
{container="sonarr"}
|
||||
|
||||
# Filter by log level
|
||||
{container="radarr"} |= "ERROR"
|
||||
|
||||
# Multiple services
|
||||
{container=~"sonarr|radarr"}
|
||||
|
||||
# Search with JSON parsing
|
||||
{container="karakeep"} |= "ollama" | json
|
||||
```
|
||||
|
||||
**Retention**: 30 days (configurable in `compose/monitoring/logging/loki-config.yaml`)
|
||||
|
||||
### Uptime Monitoring (Uptime Kuma)
|
||||
|
||||
Monitor service availability and performance:
|
||||
|
||||
**Access Uptime Kuma**: https://status.fig.systems
|
||||
|
||||
**Features:**
|
||||
- HTTP(s) monitoring for all web services
|
||||
- Docker container health checks
|
||||
- SSL certificate expiration alerts
|
||||
- Public/private status pages
|
||||
- 90+ notification integrations (Discord, Slack, Email, etc.)
|
||||
|
||||
### Service Integration
|
||||
|
||||
**How services integrate:**
|
||||
|
||||
```
|
||||
Traefik (Reverse Proxy)
|
||||
├─→ All services (SSL + routing)
|
||||
└─→ Let's Encrypt (certificates)
|
||||
|
||||
Tinyauth (SSO)
|
||||
├─→ LLDAP (user authentication)
|
||||
└─→ Protected services (authorization)
|
||||
|
||||
Promtail (Log Collection)
|
||||
├─→ Docker socket (all containers)
|
||||
└─→ Loki (log storage)
|
||||
|
||||
Loki (Log Storage)
|
||||
└─→ Grafana (visualization)
|
||||
|
||||
Karakeep (Bookmarks)
|
||||
├─→ Ollama (AI tagging)
|
||||
├─→ Meilisearch (search)
|
||||
└─→ Chrome (web archiving)
|
||||
|
||||
Sonarr/Radarr (Media Automation)
|
||||
├─→ SABnzbd/qBittorrent (downloads)
|
||||
├─→ Jellyfin (media library)
|
||||
└─→ Recyclarr/Profilarr (quality management)
|
||||
```
|
||||
|
||||
See [Architecture Guide](docs/architecture.md) for complete integration details.
|
||||
|
||||
## 📄 License
|
||||
|
||||
This is a personal homelab configuration. Use at your own risk.
|
||||
|
|
|
|||
|
|
@@ -5,7 +5,36 @@ services:
|
|||
booklore:
|
||||
container_name: booklore
|
||||
image: ghcr.io/lorebooks/booklore:latest
|
||||
restart: unless-stopped
|
||||
|
||||
env_file:
|
||||
|
||||
- .env
|
||||
|
||||
volumes:
|
||||
- ./data:/app/data
|
||||
|
||||
networks:
|
||||
- homelab
|
||||
|
||||
labels:
|
||||
# Traefik
|
||||
traefik.enable: true
|
||||
traefik.docker.network: homelab
|
||||
|
||||
# Web UI
|
||||
traefik.http.routers.booklore.rule: Host(`booklore.fig.systems`) || Host(`booklore.edfig.dev`)
|
||||
traefik.http.routers.booklore.entrypoints: websecure
|
||||
traefik.http.routers.booklore.tls.certresolver: letsencrypt
|
||||
traefik.http.services.booklore.loadbalancer.server.port: 3000
|
||||
|
||||
# SSO Protection
|
||||
traefik.http.routers.booklore.middlewares: tinyauth
|
||||
|
||||
# Homarr Discovery
|
||||
homarr.name: Booklore
|
||||
homarr.group: Services
|
||||
homarr.icon: mdi:book-open-variant
|
||||
|
||||
networks:
|
||||
homelab:
|
||||
external: true
|
||||
|
|
|
|||
|
|
@@ -5,17 +5,36 @@ services:
|
|||
microbin:
|
||||
container_name: microbin
|
||||
image: danielszabo99/microbin:latest
|
||||
env_file: .env
|
||||
restart: unless-stopped
|
||||
|
||||
env_file:
|
||||
- .env
|
||||
|
||||
volumes:
|
||||
- ./data:/app/data
|
||||
|
||||
networks:
|
||||
- homelab
|
||||
|
||||
labels:
|
||||
# Traefik
|
||||
traefik.enable: true
|
||||
traefik.docker.network: homelab
|
||||
|
||||
# Web UI
|
||||
traefik.http.routers.microbin.rule: Host(`paste.fig.systems`) || Host(`paste.edfig.dev`)
|
||||
traefik.http.routers.microbin.entrypoints: websecure
|
||||
traefik.http.routers.microbin.tls.certresolver: letsencrypt
|
||||
traefik.http.services.microbin.loadbalancer.server.port: 8080
|
||||
|
||||
# Note: MicroBin has its own auth, SSO disabled by default
|
||||
# traefik.http.routers.microbin.middlewares: tinyauth
|
||||
|
||||
# Homarr Discovery
|
||||
homarr.name: MicroBin
|
||||
homarr.group: Services
|
||||
homarr.icon: mdi:content-paste
|
||||
|
||||
networks:
|
||||
homelab:
|
||||
external: true
|
||||
|
|
|
|||
|
|
@@ -6,7 +6,36 @@ services:
|
|||
container_name: rsshub
|
||||
# Using chromium-bundled image for full puppeteer support
|
||||
image: diygod/rsshub:chromium-bundled
|
||||
restart: unless-stopped
|
||||
|
||||
env_file:
|
||||
|
||||
- .env
|
||||
|
||||
volumes:
|
||||
- ./data:/app/data
|
||||
|
||||
networks:
|
||||
- homelab
|
||||
|
||||
labels:
|
||||
# Traefik
|
||||
traefik.enable: true
|
||||
traefik.docker.network: homelab
|
||||
|
||||
# Web UI
|
||||
traefik.http.routers.rsshub.rule: Host(`rsshub.fig.systems`) || Host(`rsshub.edfig.dev`)
|
||||
traefik.http.routers.rsshub.entrypoints: websecure
|
||||
traefik.http.routers.rsshub.tls.certresolver: letsencrypt
|
||||
traefik.http.services.rsshub.loadbalancer.server.port: 1200
|
||||
|
||||
# Note: RSSHub is public by design, SSO disabled
|
||||
# traefik.http.routers.rsshub.middlewares: tinyauth
|
||||
|
||||
# Homarr Discovery
|
||||
homarr.name: RSSHub
|
||||
homarr.group: Services
|
||||
homarr.icon: mdi:rss-box
|
||||
|
||||
networks:
|
||||
homelab:
|
||||
external: true
|
||||
|
|
|
|||
648
docs/architecture.md
Normal file
|
|
@@ -0,0 +1,648 @@
|
|||
# Homelab Architecture & Integration
|
||||
|
||||
Complete integration guide for the homelab setup on AlmaLinux 9.6.
|
||||
|
||||
## 🖥️ Hardware Specifications
|
||||
|
||||
### Host System
|
||||
- **Hypervisor**: Proxmox VE 9 (Debian 13 based)
|
||||
- **CPU**: AMD Ryzen 5 7600X (6 cores, 12 threads, up to 5.3 GHz)
|
||||
- **GPU**: NVIDIA GeForce GTX 1070 (8GB VRAM, 1920 CUDA cores)
|
||||
- **RAM**: 32GB DDR5
|
||||
|
||||
### VM Configuration
|
||||
- **OS**: AlmaLinux 9.6 (RHEL 9 compatible)
|
||||
- **CPU**: 8 vCPUs (allocated from host)
|
||||
- **RAM**: 24GB (leaving 8GB for host)
|
||||
- **Storage**: 500GB+ (adjust based on media library size)
|
||||
- **GPU**: GTX 1070 (PCIe passthrough from Proxmox)
|
||||
|
||||
## 🏗️ Architecture Overview
|
||||
|
||||
### Network Architecture
|
||||
|
||||
```
|
||||
Internet
|
||||
↓
|
||||
[Router/Firewall]
|
||||
↓ (Port 80/443)
|
||||
[Traefik Reverse Proxy]
|
||||
↓
|
||||
┌──────────────────────────────────────┐
|
||||
│ homelab network │
|
||||
│ (Docker bridge - 172.18.0.0/16) │
|
||||
│ │
|
||||
│ ┌─────────────┐ ┌──────────────┐ │
|
||||
│ │ Core │ │ Media │ │
|
||||
│ │ - Traefik │ │ - Jellyfin │ │
|
||||
│ │ - LLDAP │ │ - Sonarr │ │
|
||||
│ │ - Tinyauth │ │ - Radarr │ │
|
||||
│ └─────────────┘ └──────────────┘ │
|
||||
│ │
|
||||
│ ┌─────────────┐ ┌──────────────┐ │
|
||||
│ │ Services │ │ Monitoring │ │
|
||||
│ │ - Karakeep │ │ - Loki │ │
|
||||
│ │ - Ollama │ │ - Promtail │ │
|
||||
│ │ - Vikunja │ │ - Grafana │ │
|
||||
│ └─────────────┘ └──────────────┘ │
|
||||
└──────────────────────────────────────┘
|
||||
↓
|
||||
[Promtail Agent]
|
||||
↓
|
||||
[Loki Storage]
|
||||
```
|
||||
|
||||
### Service Internal Networks
|
||||
|
||||
Services with databases use isolated internal networks:
|
||||
|
||||
```
|
||||
karakeep
|
||||
├── homelab (external traffic)
|
||||
└── karakeep_internal
|
||||
├── karakeep (app)
|
||||
├── karakeep-chrome (browser)
|
||||
└── karakeep-meilisearch (search)
|
||||
|
||||
vikunja
|
||||
├── homelab (external traffic)
|
||||
└── vikunja_internal
|
||||
├── vikunja (app)
|
||||
└── vikunja-db (postgres)
|
||||
|
||||
monitoring/logging
|
||||
├── homelab (external traffic)
|
||||
└── logging_internal
|
||||
├── loki (storage)
|
||||
├── promtail (collector)
|
||||
└── grafana (UI)
|
||||
```
|
||||
|
||||
## 🔐 Security Architecture
|
||||
|
||||
### Authentication Flow
|
||||
|
||||
```
|
||||
User Request
|
||||
↓
|
||||
[Traefik] → Check route rules
|
||||
↓
|
||||
[Tinyauth Middleware] → Forward Auth
|
||||
↓
|
||||
[LLDAP] → Verify credentials
|
||||
↓
|
||||
[Backend Service] → Authorized access
|
||||
```
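The forward-auth step in the flow above can be modelled in a few lines. This is a minimal sketch of how Traefik's ForwardAuth middleware treats the auth server's answer (a 2xx grants access, anything else is returned to the client); `AuthResponse`, `route_request`, and the login URL are hypothetical names for illustration, not part of Tinyauth or Traefik.

```python
from dataclasses import dataclass


@dataclass
class AuthResponse:
    """Answer from the auth server (Tinyauth in this setup)."""
    status: int
    location: str = ""  # redirect target, if the auth server sent one


def route_request(auth: AuthResponse) -> str:
    """Model the ForwardAuth decision for a single request."""
    if 200 <= auth.status < 300:
        # 2xx from the auth server: the original request goes to the backend.
        return "forward to backend service"
    # Any other status: the auth server's response is handed back to the client.
    if auth.location:
        return f"redirect to {auth.location}"
    return f"reject with HTTP {auth.status}"


if __name__ == "__main__":
    print(route_request(AuthResponse(200)))
    # Hypothetical login URL, used only for illustration.
    print(route_request(AuthResponse(302, "https://auth.example/login")))
    print(route_request(AuthResponse(401)))
```

The credential check against LLDAP happens inside the auth server, so Traefik only ever sees the resulting status code.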
|
||||
|
||||
### SSL/TLS
|
||||
|
||||
- **Certificate Provider**: Let's Encrypt
|
||||
- **Challenge Type**: HTTP-01 (validated over port 80)
- **Automatic Renewal**: Via Traefik
- **Domains** (one certificate per service hostname; HTTP-01 cannot issue wildcard certificates):
  - Primary: `<service>.fig.systems`
  - Fallback: `<service>.edfig.dev`
|
||||
|
||||
### SSO Protection
|
||||
|
||||
**Protected Services** (require authentication):
|
||||
- Traefik Dashboard
|
||||
- LLDAP
|
||||
- Sonarr, Radarr, SABnzbd, qBittorrent
|
||||
- Profilarr, Recyclarr (monitoring)
|
||||
- Homarr, Backrest
|
||||
- Karakeep, Vikunja, LubeLogger
|
||||
- Calibre-web, Booklore, FreshRSS, File Browser
|
||||
- Loki API, Ollama API
|
||||
|
||||
**Unprotected Services** (own authentication):
|
||||
- Tinyauth (SSO provider itself)
|
||||
- Jellyfin (own user system)
|
||||
- Jellyseerr (linked to Jellyfin)
|
||||
- Immich (own user system)
|
||||
- RSSHub (public feed generator)
|
||||
- MicroBin (public pastebin)
|
||||
- Grafana (own authentication)
|
||||
- Uptime Kuma (own authentication)
|
||||
|
||||
## 📊 Logging Architecture
|
||||
|
||||
### Centralized Logging with Loki
|
||||
|
||||
All services forward logs to Loki via Promtail:
|
||||
|
||||
```
|
||||
[Docker Container] → stdout/stderr
|
||||
↓
|
||||
[Docker Socket] → /var/run/docker.sock
|
||||
↓
|
||||
[Promtail] → Scrapes logs via Docker API
|
||||
↓
|
||||
[Loki] → Stores and indexes logs
|
||||
↓
|
||||
[Grafana] → Query and visualize
|
||||
```
|
||||
|
||||
### Log Labels
|
||||
|
||||
Promtail automatically adds labels to all logs:
|
||||
- `container`: Container name
|
||||
- `compose_project`: Docker Compose project
|
||||
- `compose_service`: Service name from compose
|
||||
- `image`: Docker image name
|
||||
- `stream`: stdout or stderr
|
||||
|
||||
### Log Retention
|
||||
|
||||
- **Default**: 30 days
|
||||
- **Storage**: `compose/monitoring/logging/loki-data/`
|
||||
- **Automatic cleanup**: Enabled via Loki compactor
|
||||
|
||||
### Querying Logs
|
||||
|
||||
**View all logs for a service:**
|
||||
```logql
|
||||
{container="sonarr"}
|
||||
```
|
||||
|
||||
**Filter by log level:**
|
||||
```logql
|
||||
{container="radarr"} |= "ERROR"
|
||||
```
|
||||
|
||||
**Multiple services:**
|
||||
```logql
|
||||
{container=~"sonarr|radarr"}
|
||||
```
|
||||
|
||||
**Search with JSON parsing:**
|
||||
```logql
|
||||
{container="karakeep"} |= "ollama" | json
|
||||
```
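The four query shapes above follow one pattern: a container selector, an optional line filter, and an optional `json` parser stage. A small sketch that assembles them, assuming container names contain no regex metacharacters; `logql` is a hypothetical helper, not part of Loki or Grafana.

```python
def logql(containers, contains=None, parse_json=False):
    """Build a LogQL query for one or more containers."""
    if not containers:
        raise ValueError("at least one container name is required")
    if len(containers) == 1:
        # Exact match on a single container.
        selector = f'{{container="{containers[0]}"}}'
    else:
        # Regex match across several containers (names assumed regex-safe).
        selector = f'{{container=~"{"|".join(containers)}"}}'
    query = selector
    if contains is not None:
        query += f' |= "{contains}"'  # line filter: keep lines containing the text
    if parse_json:
        query += " | json"  # parser stage: extract JSON fields as labels
    return query


if __name__ == "__main__":
    print(logql(["sonarr"]))
    print(logql(["radarr"], contains="ERROR"))
    print(logql(["sonarr", "radarr"]))
    print(logql(["karakeep"], contains="ollama", parse_json=True))
```

The generated strings can be pasted into Grafana's Explore view unchanged.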
|
||||
|
||||
## 🌐 Network Configuration
|
||||
|
||||
### Docker Networks
|
||||
|
||||
**homelab** (external bridge):
|
||||
- Type: External bridge network
|
||||
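- Subnet: Auto-assigned by Docker (the 172.18.0.0/16 shown in the network diagram is an example; confirm with `docker network inspect homelab`)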
|
||||
- Purpose: Inter-service communication + Traefik routing
|
||||
- Create: `docker network create homelab`
|
||||
|
||||
**Service-specific internal networks**:
|
||||
- `karakeep_internal`: Karakeep + Chrome + Meilisearch
|
||||
- `vikunja_internal`: Vikunja + PostgreSQL
|
||||
- `logging_internal`: Loki + Promtail + Grafana
|
||||
- etc.
|
||||
|
||||
### Port Mappings
|
||||
|
||||
**External Ports** (exposed to host):
|
||||
- `80/tcp`: HTTP (Traefik) - redirects to HTTPS
|
||||
- `443/tcp`: HTTPS (Traefik)
|
||||
- `6881/tcp+udp`: BitTorrent (qBittorrent)
|
||||
|
||||
**No other ports exposed** - all access via Traefik reverse proxy.
|
||||
|
||||
## 🔧 Traefik Integration
|
||||
|
||||
### Standard Traefik Labels
|
||||
|
||||
All services use consistent Traefik labels:
|
||||
|
||||
```yaml
|
||||
labels:
|
||||
# Enable Traefik
|
||||
traefik.enable: true
|
||||
traefik.docker.network: homelab
|
||||
|
||||
# Router configuration
|
||||
traefik.http.routers.<service>.rule: Host(`<service>.fig.systems`) || Host(`<service>.edfig.dev`)
|
||||
traefik.http.routers.<service>.entrypoints: websecure
|
||||
traefik.http.routers.<service>.tls.certresolver: letsencrypt
|
||||
|
||||
# Service configuration (backend port)
|
||||
traefik.http.services.<service>.loadbalancer.server.port: <port>
|
||||
|
||||
# SSO middleware (if protected)
|
||||
traefik.http.routers.<service>.middlewares: tinyauth
|
||||
|
||||
# Homarr auto-discovery
|
||||
homarr.name: <Service Name>
|
||||
homarr.group: <Category>
|
||||
homarr.icon: mdi:<icon-name>
|
||||
```
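Because every service repeats the same label set, it can be generated rather than copied. The sketch below builds the labels from the template above; `traefik_labels` is a hypothetical helper, and it assumes the hostname equals the service name (not true for every service: MicroBin is served as `paste`). The Homarr labels are left out because they vary per service.

```python
def traefik_labels(service: str, port: int, protected: bool = True) -> dict[str, str]:
    """Return the standard Traefik labels for one service."""
    router = f"traefik.http.routers.{service}"
    labels = {
        "traefik.enable": "true",
        "traefik.docker.network": "homelab",
        # Both domains route to the same backend.
        f"{router}.rule": f"Host(`{service}.fig.systems`) || Host(`{service}.edfig.dev`)",
        f"{router}.entrypoints": "websecure",
        f"{router}.tls.certresolver": "letsencrypt",
        f"traefik.http.services.{service}.loadbalancer.server.port": str(port),
    }
    if protected:
        # Only SSO-protected services get the tinyauth middleware.
        labels[f"{router}.middlewares"] = "tinyauth"
    return labels


if __name__ == "__main__":
    for key, value in traefik_labels("rsshub", 1200, protected=False).items():
        print(f"{key}: {value}")
```

Passing `protected=False` mirrors the services that comment out the `middlewares` label, such as RSSHub.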
|
||||
|
||||
### Middleware
|
||||
|
||||
**tinyauth** - Forward authentication:
|
||||
```yaml
|
||||
# Defined in traefik/compose.yaml
|
||||
middlewares:
|
||||
tinyauth:
|
||||
forwardAuth:
|
||||
address: http://tinyauth:8080
|
||||
trustForwardHeader: true
|
||||
```
|
||||
|
||||
## 💾 Volume Management
|
||||
|
||||
### Volume Types
|
||||
|
||||
**Bind Mounts** (host directories):
|
||||
```yaml
|
||||
volumes:
|
||||
- ./data:/data # Service data
|
||||
- ./config:/config # Configuration files
|
||||
- /media:/media # Media library (shared)
|
||||
```
|
||||
|
||||
**Named Volumes** (Docker-managed):
|
||||
```yaml
|
||||
volumes:
|
||||
- loki-data:/loki # Loki storage
|
||||
- postgres-data:/var/lib/postgresql/data
|
||||
```
|
||||
|
||||
### Media Directory Structure
|
||||
|
||||
```
|
||||
/media/
|
||||
├── tv/ # TV shows (Sonarr → Jellyfin)
|
||||
├── movies/ # Movies (Radarr → Jellyfin)
|
||||
├── music/ # Music
|
||||
├── photos/ # Photos (Immich)
|
||||
├── books/ # Ebooks (Calibre-web)
|
||||
├── audiobooks/ # Audiobooks
|
||||
├── comics/ # Comics
|
||||
├── homemovies/ # Home videos
|
||||
├── downloads/ # Active downloads (SABnzbd/qBittorrent)
|
||||
├── complete/ # Completed downloads
|
||||
└── incomplete/ # In-progress downloads
|
||||
```
|
||||
|
||||
### Backup Strategy
|
||||
|
||||
**Important directories to backup:**
|
||||
```
|
||||
compose/core/lldap/data/ # User directory
|
||||
compose/core/traefik/letsencrypt/ # SSL certificates
|
||||
compose/services/*/config/ # Service configurations
|
||||
compose/services/*/data/ # Service data
|
||||
compose/monitoring/logging/loki-data/ # Logs (optional, skipped by default; see exclusions below)
|
||||
/media/ # Media library
|
||||
```
|
||||
|
||||
**Excluded from backups:**
|
||||
```
|
||||
compose/services/*/db/ # Databases (backup via dump)
|
||||
compose/monitoring/logging/loki-data/ # Logs (can be recreated)
|
||||
/media/downloads/ # Temporary downloads
|
||||
/media/incomplete/ # Incomplete downloads
|
||||
```
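A backup script can apply the exclusion list with simple glob matching. This is a sketch under the assumption that paths are written exactly as in the lists above; `EXCLUDE_PATTERNS` and `is_excluded` are illustrative names, and `fnmatch`'s `*` also matches across `/`, which is what makes the patterns cover nested files.

```python
from fnmatch import fnmatch

# Glob patterns derived from the "Excluded from backups" list.
EXCLUDE_PATTERNS = [
    "compose/services/*/db/*",
    "compose/monitoring/logging/loki-data/*",
    "/media/downloads/*",
    "/media/incomplete/*",
]


def is_excluded(path: str) -> bool:
    """Return True if the path falls under any excluded directory."""
    return any(fnmatch(path, pattern) for pattern in EXCLUDE_PATTERNS)


if __name__ == "__main__":
    for candidate in (
        "compose/services/vikunja/db/base/1",
        "compose/services/vikunja/config/config.yml",
        "/media/downloads/show.mkv",
        "/media/tv/show.mkv",
    ):
        print(candidate, "->", "skip" if is_excluded(candidate) else "back up")
```

Databases skipped here still need their own dump step, as the list notes.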
|
||||
|
||||
## 🎮 GPU Acceleration
|
||||
|
||||
### NVIDIA GTX 1070 Configuration
|
||||
|
||||
**GPU Passthrough (Proxmox → VM):**
|
||||
|
||||
1. **Proxmox host** (`/etc/pve/nodes/<node>/qemu-server/<vmid>.conf`):
|
||||
```
|
||||
hostpci0: 0000:01:00,pcie=1,x-vga=1
|
||||
```
|
||||
|
||||
2. **VM (AlmaLinux)** - Install NVIDIA drivers:
|
||||
```bash
|
||||
# Add NVIDIA repository
|
||||
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
|
||||
|
||||
# Install drivers
|
||||
sudo dnf install nvidia-driver nvidia-settings
|
||||
|
||||
# Verify
|
||||
nvidia-smi
|
||||
```
|
||||
|
||||
3. **Docker** - Install NVIDIA Container Toolkit:
|
||||
```bash
|
||||
# Add NVIDIA Container Toolkit repo
|
||||
sudo dnf config-manager --add-repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo
|
||||
|
||||
# Install toolkit
|
||||
sudo dnf install nvidia-container-toolkit
|
||||
|
||||
# Configure Docker
|
||||
sudo nvidia-ctk runtime configure --runtime=docker
|
||||
sudo systemctl restart docker
|
||||
|
||||
# Verify
|
||||
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
|
||||
```
|
||||
|
||||
### Services Using GPU
|
||||
|
||||
**Jellyfin** (Hardware transcoding):
|
||||
```yaml
|
||||
# Uncomment in compose.yaml
|
||||
devices:
|
||||
- /dev/dri:/dev/dri # VAAPI/QSV render nodes; NVENC/NVDEC instead needs the NVIDIA runtime or a GPU reservation
|
||||
environment:
|
||||
- NVIDIA_VISIBLE_DEVICES=all
|
||||
- NVIDIA_DRIVER_CAPABILITIES=all
|
||||
```
|
||||
|
||||
**Immich** (AI features):
|
||||
```yaml
|
||||
# Already configured
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 1
|
||||
capabilities: [gpu]
|
||||
```
|
||||
|
||||
**Ollama** (LLM inference):
|
||||
```yaml
|
||||
# Uncomment in compose.yaml
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: 1
|
||||
capabilities: [gpu]
|
||||
```
|
||||
|
||||
### GPU Performance Tuning
|
||||
|
||||
**Approximate figures for Ryzen 5 7600X + GTX 1070 (actual throughput varies with codec, bitrate, and model quantization):**
|
||||
|
||||
- **Jellyfin**: Can transcode 4-6 simultaneous 4K → 1080p streams
|
||||
- **Ollama**:
|
||||
- 3B models: 40-60 tokens/sec
|
||||
- 7B models: 20-35 tokens/sec
|
||||
- 13B models: 10-15 tokens/sec (quantized)
|
||||
- **Immich**: AI tagging ~5-10 images/sec
|
||||
|
||||
## 🚀 Resource Allocation
|
||||
|
||||
### CPU Allocation (Ryzen 5 7600X - 6C/12T)
|
||||
|
||||
**High Priority** (4-6 cores):
|
||||
- Jellyfin (transcoding)
|
||||
- Sonarr/Radarr (media processing)
|
||||
- Ollama (when running)
|
||||
|
||||
**Medium Priority** (2-4 cores):
|
||||
- Immich (AI processing)
|
||||
- Karakeep (bookmark processing)
|
||||
- SABnzbd/qBittorrent (downloads)
|
||||
|
||||
**Low Priority** (1-2 cores):
|
||||
- Traefik, LLDAP, Tinyauth
|
||||
- Monitoring services
|
||||
- Other utilities
|
||||
|
||||
### RAM Allocation (32GB Total, 24GB VM)
|
||||
|
||||
**Recommended allocation:**
|
||||
|
||||
```
|
||||
Host (Proxmox): 8GB
|
||||
VM Total: 24GB breakdown:
|
||||
├── System: 4GB (AlmaLinux base)
|
||||
├── Docker: 2GB (daemon overhead)
|
||||
├── Jellyfin: 2-4GB (transcoding buffers)
|
||||
├── Immich: 2-3GB (ML models + database)
|
||||
├── Sonarr/Radarr: 1GB each
|
||||
├── Ollama: 4-6GB (when running models)
|
||||
├── Databases: 2-3GB total
|
||||
├── Monitoring: 2GB (Loki + Grafana)
|
||||
└── Other services: 4-5GB
|
||||
```
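Adding up the ranges shows how tight the plan is: the low estimates already total 24GB, and the high estimates total 31GB, so not every service can peak at once. A short sketch of that arithmetic, using the figures from the table above (Sonarr and Radarr counted at 1GB each); the names are illustrative.

```python
# (low, high) estimates in GB, copied from the allocation table.
RAM_BUDGET_GB = {
    "system": (4, 4),
    "docker": (2, 2),
    "jellyfin": (2, 4),
    "immich": (2, 3),
    "sonarr": (1, 1),
    "radarr": (1, 1),
    "ollama": (4, 6),
    "databases": (2, 3),
    "monitoring": (2, 2),
    "other": (4, 5),
}
VM_RAM_GB = 24


def budget_range(budget):
    """Return (sum of low estimates, sum of high estimates) in GB."""
    low = sum(lo for lo, _ in budget.values())
    high = sum(hi for _, hi in budget.values())
    return low, high


if __name__ == "__main__":
    low, high = budget_range(RAM_BUDGET_GB)
    print(f"planned: {low}-{high} GB of {VM_RAM_GB} GB")
    if high > VM_RAM_GB:
        print(f"worst case exceeds the VM by {high - VM_RAM_GB} GB")
```

In practice this means running Ollama models and heavy Jellyfin transcodes at the same time is the case to watch.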
|
||||
|
||||
### Disk Space Planning
|
||||
|
||||
**System:** 100GB
|
||||
**Docker:** 50GB (images + containers)
|
||||
**Service Data:** 50GB (configs, databases, logs)
|
||||
**Media Library:** Remaining space (expandable)
|
||||
|
||||
**Recommended VM disk:**
|
||||
- Minimum: 500GB (200GB system + 300GB media)
|
||||
- Recommended: 1TB+ (allows room for growth)
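The media figure is simply whatever remains after the fixed allocations. A minimal sketch of the calculation, assuming 1TB is counted as 1000GB; `FIXED_GB` and `media_space_gb` are illustrative names.

```python
# Fixed allocations in GB, from the plan above.
FIXED_GB = {"system": 100, "docker": 50, "service_data": 50}


def media_space_gb(disk_gb: int) -> int:
    """Space left for the media library after the fixed allocations."""
    remaining = disk_gb - sum(FIXED_GB.values())
    if remaining < 0:
        raise ValueError(f"{disk_gb} GB is smaller than the fixed allocations")
    return remaining


if __name__ == "__main__":
    for size in (500, 1000):
        print(f"{size} GB disk -> {media_space_gb(size)} GB for media")
```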
|
||||
|
||||
## 🔄 Service Dependencies
|
||||
|
||||
### Startup Order
|
||||
|
||||
**Critical order for initial deployment:**
|
||||
|
||||
1. **Networks**: `docker network create homelab`
|
||||
2. **Core** (must start first):
|
||||
- Traefik (reverse proxy)
|
||||
- LLDAP (user directory)
|
||||
- Tinyauth (SSO provider)
|
||||
3. **Monitoring** (optional but recommended):
|
||||
- Loki + Promtail + Grafana
|
||||
- Uptime Kuma
|
||||
4. **Media Automation**:
|
||||
- Sonarr, Radarr
|
||||
- SABnzbd, qBittorrent
|
||||
- Recyclarr, Profilarr
|
||||
5. **Media Frontend**:
|
||||
- Jellyfin
|
||||
- Jellyseerr
|
||||
- Immich
|
||||
6. **Services**:
|
||||
- Karakeep, Ollama (AI features)
|
||||
- Vikunja, Homarr
|
||||
- All other services
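The tiers above amount to a dependency graph, and a valid start order can be derived from it with a topological sort. The sketch below uses the standard library's `graphlib` (Python 3.9+); the edges in `DEPENDS_ON` are assumptions read off this guide, not something the compose files enforce.

```python
from graphlib import TopologicalSorter

# service -> services that should already be running (assumed edges).
DEPENDS_ON = {
    "traefik": [],
    "lldap": [],
    "tinyauth": ["traefik", "lldap"],
    "loki": ["traefik"],
    "promtail": ["loki"],
    "grafana": ["loki"],
    "sonarr": ["tinyauth"],
    "radarr": ["tinyauth"],
    "jellyfin": ["traefik"],
    "jellyseerr": ["jellyfin", "sonarr", "radarr"],
    "ollama": ["tinyauth"],
    "karakeep": ["tinyauth", "ollama"],
}


def startup_order(graph):
    """Return the services in an order that starts dependencies first."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for position, service in enumerate(startup_order(DEPENDS_ON), start=1):
        print(f"{position:2d}. {service}")
```

`TopologicalSorter` raises `CycleError` if two services end up depending on each other, which is a useful check when the graph grows.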
|
||||
|
||||
### Service Integration Map
|
||||
|
||||
```
|
||||
Traefik
|
||||
├─→ All services (reverse proxy)
|
||||
└─→ Let's Encrypt (SSL)
|
||||
|
||||
Tinyauth
|
||||
├─→ LLDAP (authentication backend)
|
||||
└─→ All SSO-protected services
|
||||
|
||||
LLDAP
|
||||
└─→ User database for SSO
|
||||
|
||||
Promtail
|
||||
├─→ Docker socket (log collection)
|
||||
└─→ Loki (log forwarding)
|
||||
|
||||
Loki
|
||||
└─→ Grafana (log visualization)
|
||||
|
||||
Karakeep
|
||||
├─→ Ollama (AI tagging)
|
||||
├─→ Meilisearch (search)
|
||||
└─→ Chrome (web archiving)
|
||||
|
||||
Jellyseerr
|
||||
├─→ Jellyfin (media info)
|
||||
├─→ Sonarr (TV requests)
|
||||
└─→ Radarr (movie requests)
|
||||
|
||||
Sonarr/Radarr
|
||||
├─→ SABnzbd/qBittorrent (downloads)
|
||||
├─→ Jellyfin (media library)
|
||||
└─→ Recyclarr/Profilarr (quality profiles)
|
||||
|
||||
Homarr
|
||||
└─→ All services (dashboard auto-discovery)
|
||||
```
|
||||
|
||||
## 🐛 Troubleshooting
|
||||
|
||||
### Check Service Health
|
||||
|
||||
```bash
|
||||
# All services status
|
||||
cd ~/homelab
|
||||
docker ps -a --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
|
||||
|
||||
# Logs for specific service
|
||||
docker logs <service-name> --tail 100 -f
|
||||
|
||||
# Logs via Loki/Grafana
|
||||
# Go to https://logs.fig.systems
|
||||
# Query: {container="<service-name>"}
|
||||
```
|
||||
|
||||
### Network Issues
|
||||
|
||||
```bash
|
||||
# Check homelab network exists
|
||||
docker network ls | grep homelab
|
||||
|
||||
# Inspect network
|
||||
docker network inspect homelab
|
||||
|
||||
# Test service connectivity
|
||||
docker exec <service-a> ping <service-b>
|
||||
docker exec karakeep curl http://ollama:11434
|
||||
```
|
||||
|
||||
### GPU Not Detected
|
||||
|
||||
```bash
|
||||
# Check GPU in VM
|
||||
nvidia-smi
|
||||
|
||||
# Check Docker can access GPU
|
||||
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
|
||||
|
||||
# Check service GPU allocation
|
||||
docker exec jellyfin nvidia-smi
|
||||
docker exec ollama nvidia-smi
|
||||
```
|
||||
|
||||
### SSL Certificate Issues
|
||||
|
||||
```bash
|
||||
# Check Traefik logs
|
||||
docker logs traefik | grep -i certificate
|
||||
|
||||
# Force certificate re-issuance (deletes ALL stored certificates; Let's Encrypt rate limits apply)
docker exec traefik rm -f /letsencrypt/acme.json
|
||||
docker restart traefik
|
||||
|
||||
# Verify DNS
|
||||
dig +short sonarr.fig.systems
|
||||
```
|
||||
|
||||
### SSO Not Working
|
||||
|
||||
```bash
|
||||
# Check Tinyauth status
|
||||
docker logs tinyauth
|
||||
|
||||
# Check LLDAP connection
|
||||
docker exec tinyauth nc -zv lldap 3890
|
||||
docker exec tinyauth nc -zv lldap 17170
|
||||
|
||||
# Verify credentials match
|
||||
grep LDAP_BIND_PASSWORD compose/core/tinyauth/.env
|
||||
grep LLDAP_LDAP_USER_PASS compose/core/lldap/.env
|
||||
```
|
||||
|
||||
## 📈 Monitoring Best Practices
|
||||
|
||||
### Key Metrics to Monitor
|
||||
|
||||
**System Level:**
|
||||
- CPU usage per container
|
||||
- Memory usage per container
|
||||
- Disk I/O
|
||||
- Network throughput
|
||||
- GPU utilization (for Jellyfin/Ollama/Immich)
|
||||
|
||||
**Application Level:**
|
||||
- Traefik request rate
|
||||
- Failed authentication attempts
|
||||
- Jellyfin concurrent streams
|
||||
- Download speeds (SABnzbd/qBittorrent)
|
||||
- Sonarr/Radarr queue size
|
||||
|
||||
### Uptime Kuma Monitoring
|
||||
|
||||
Configure monitors for:
|
||||
- **HTTP(s)**: All web services (200 status check)
|
||||
- **TCP**: Database ports (PostgreSQL, etc.)
|
||||
- **Docker**: Container health (via Docker socket)
|
||||
- **SSL**: Certificate expiration (30-day warning)
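The 30-day certificate warning can also be checked by hand. This is a minimal sketch using only the Python standard library, independent of how Uptime Kuma implements its own check; `days_remaining`, `fetch_not_after`, and `WARN_DAYS` are illustrative names.

```python
import socket
import ssl

WARN_DAYS = 30  # same threshold as the SSL monitor above


def days_remaining(not_after: str, now: float) -> int:
    """Whole days between `now` (epoch seconds) and a certificate's notAfter."""
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now) // 86400)


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field of the certificate served by host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Example (needs network access):
#   import time
#   left = days_remaining(fetch_not_after("status.fig.systems"), time.time())
#   print("renew soon" if left < WARN_DAYS else f"{left} days left")
```

Traefik renews certificates on its own, so a warning here usually points to a failed HTTP-01 challenge rather than a missed manual step.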
|
||||
|
||||
### Log Monitoring
|
||||
|
||||
Set up Loki alerts for:
|
||||
- ERROR level logs
|
||||
- Authentication failures
|
||||
- Service crashes
|
||||
- Disk space warnings
|
||||
|
||||
## 🔧 Maintenance Tasks
|
||||
|
||||
### Daily
|
||||
- Check Uptime Kuma dashboard
|
||||
- Review any critical alerts
|
||||
|
||||
### Weekly
|
||||
- Check disk space: `df -h`
|
||||
- Review failed downloads in Sonarr/Radarr
|
||||
- Check Loki logs for errors
|
||||
|
||||
### Monthly
|
||||
- Update all containers: `docker compose pull && docker compose up -d`
|
||||
- Review and clean old Docker images: `docker image prune -a`
|
||||
- Backup configurations
|
||||
- Check SSL certificate renewal
|
||||
|
||||
### Quarterly
|
||||
- Review and update documentation
|
||||
- Clean up old media (if needed)
|
||||
- Review and adjust quality profiles
|
||||
- Update Recyclarr configurations
|
||||
|
||||
## 📚 Additional Resources
|
||||
|
||||
- [Traefik Documentation](https://doc.traefik.io/traefik/)
|
||||
- [Docker Compose Best Practices](https://docs.docker.com/compose/production/)
|
||||
- [Loki LogQL Guide](https://grafana.com/docs/loki/latest/logql/)
|
||||
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/)
|
||||
- [Proxmox GPU Passthrough](https://pve.proxmox.com/wiki/PCI_Passthrough)
|
||||
- [AlmaLinux Documentation](https://wiki.almalinux.org/)
|
||||
|
||||
---
|
||||
|
||||
**System Ready!** 🚀
|
||||
775
docs/setup/almalinux-vm.md
Normal file
|
|
@@ -0,0 +1,775 @@
|
|||
# AlmaLinux 9.6 VM Setup Guide
|
||||
|
||||
Complete setup guide for the homelab VM on AlmaLinux 9.6 running on Proxmox VE 9.
|
||||
|
||||
## Hardware Context
|
||||
|
||||
- **Host**: Proxmox VE 9 (Debian 13 based)
|
||||
- CPU: AMD Ryzen 5 7600X (6C/12T, 5.3 GHz boost)
|
||||
- GPU: NVIDIA GTX 1070 (8GB VRAM)
|
||||
- RAM: 32GB DDR5
|
||||
|
||||
- **VM Allocation**:
|
||||
- OS: AlmaLinux 9.6 (RHEL 9 compatible)
|
||||
- CPU: 8 vCPUs
|
||||
- RAM: 24GB
|
||||
- Disk: 500GB+ (expandable)
|
||||
- GPU: GTX 1070 (PCIe passthrough)
|
||||
|
||||
## Proxmox VM Creation
|
||||
|
||||
### 1. Create VM
|
||||
|
||||
```bash
|
||||
# On Proxmox host
|
||||
qm create 100 \
|
||||
--name homelab \
|
||||
--memory 24576 \
|
||||
--cores 8 \
|
||||
--cpu host \
|
||||
--sockets 1 \
|
||||
--net0 virtio,bridge=vmbr0 \
|
||||
--scsi0 local-lvm:500 \
|
||||
--ostype l26 \
|
||||
--boot order=scsi0
|
||||
|
||||
# Attach AlmaLinux ISO
|
||||
qm set 100 --ide2 local:iso/AlmaLinux-9.6-x86_64-dvd.iso,media=cdrom
|
||||
|
||||
# Enable UEFI and the q35 machine type (pcie=1 passthrough requires q35)
qm set 100 --bios ovmf --machine q35 --efidisk0 local-lvm:1
|
||||
```
|
||||
|
||||
### 2. GPU Passthrough
|
||||
|
||||
**Find GPU PCI address:**
|
||||
```bash
|
||||
lspci | grep -i nvidia
|
||||
# Example output: 01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070]
|
||||
```
|
||||
|
||||
**Enable IOMMU in Proxmox:**
|
||||
|
||||
Edit `/etc/default/grub`:
|
||||
```bash
|
||||
# For AMD CPU (Ryzen 5 7600X): AMD IOMMU is on by default once enabled in firmware, so iommu=pt is the setting that matters
|
||||
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"
|
||||
```
|
||||
|
||||
Update GRUB and reboot:
|
||||
```bash
|
||||
update-grub
|
||||
reboot
|
||||
```
|
||||
|
||||
**Verify IOMMU:**
|
||||
```bash
|
||||
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
|
||||
# Should show IOMMU enabled
|
||||
```
|
||||
|
||||
**Add GPU to VM:**
|
||||
|
||||
Edit `/etc/pve/qemu-server/100.conf`:
|
||||
```
|
||||
hostpci0: 0000:01:00,pcie=1,x-vga=1
|
||||
```
|
||||
|
||||
Or via command:
|
||||
```bash
|
||||
qm set 100 --hostpci0 0000:01:00,pcie=1,x-vga=1
|
||||
```
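The `hostpci0` value is derived from the `lspci` line found earlier: prepend the PCI domain and drop the function number so that all functions of the card (GPU and its audio device) are passed through. A small sketch of that mapping, assuming the default `0000` domain; `hostpci_value` is a hypothetical helper, not a Proxmox tool.

```python
import re


def hostpci_value(lspci_line: str, domain: str = "0000") -> str:
    """Turn an `lspci` output line into a hostpci0 setting."""
    # lspci prints addresses as bus:device.function, for example 01:00.0
    match = re.match(r"([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.[0-7]\s", lspci_line)
    if match is None:
        raise ValueError(f"no PCI address at start of line: {lspci_line!r}")
    bus, device = match.groups()
    # Omitting the function number passes through every function of the device.
    return f"{domain}:{bus}:{device},pcie=1,x-vga=1"


if __name__ == "__main__":
    line = "01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070]"
    print(hostpci_value(line))
```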
|
||||
|
||||
**Blacklist GPU on host:**
|
||||
|
||||
Edit `/etc/modprobe.d/blacklist-nvidia.conf`:
|
||||
```
|
||||
blacklist nouveau
|
||||
blacklist nvidia
|
||||
blacklist nvidia_drm
|
||||
blacklist nvidia_modeset
|
||||
blacklist nvidia_uvm
|
||||
```
|
||||
|
||||
Update initramfs:
|
||||
```bash
|
||||
update-initramfs -u
|
||||
reboot
|
||||
```
|
||||
|
||||
## AlmaLinux Installation
|
||||
|
||||
### 1. Install AlmaLinux 9.6
|
||||
|
||||
Start VM and follow installer:
|
||||
1. **Language**: English (US)
|
||||
2. **Installation Destination**: Use all space, automatic partitioning
|
||||
3. **Network**: Enable and set hostname to `homelab.fig.systems`
|
||||
4. **Software Selection**: Minimal Install
|
||||
5. **Root Password**: Set strong password
|
||||
6. **User Creation**: Create admin user (e.g., `homelab`)
|
||||
|
||||
### 2. Post-Installation Configuration
|
||||
|
||||
```bash
|
||||
# SSH into VM
|
||||
ssh homelab@<vm-ip>
|
||||
|
||||
# Update system
|
||||
sudo dnf update -y
|
||||
|
||||
# Install essential tools
|
||||
sudo dnf install -y \
|
||||
vim \
|
||||
git \
|
||||
curl \
|
||||
wget \
|
||||
htop \
|
||||
ncdu \
|
||||
tree \
|
||||
tmux \
|
||||
bind-utils \
|
||||
net-tools \
|
||||
firewalld
|
||||
|
||||
# Enable and configure firewall
|
||||
sudo systemctl enable --now firewalld
|
||||
sudo firewall-cmd --permanent --add-service=http
|
||||
sudo firewall-cmd --permanent --add-service=https
|
||||
sudo firewall-cmd --reload
|
||||
```
|
||||
|
||||
### 3. Configure Static IP (Optional)
|
||||
|
||||
```bash
|
||||
# Find connection name
|
||||
nmcli connection show
|
||||
|
||||
# Set static IP (example: 192.168.1.100)
|
||||
sudo nmcli connection modify "System eth0" \
|
||||
ipv4.addresses 192.168.1.100/24 \
|
||||
ipv4.gateway 192.168.1.1 \
|
||||
ipv4.dns "1.1.1.1,8.8.8.8" \
|
||||
ipv4.method manual
|
||||
|
||||
# Restart network
|
||||
sudo nmcli connection down "System eth0"
|
||||
sudo nmcli connection up "System eth0"
|
||||
```
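A wrong gateway or prefix length is the usual way to lose SSH access after this step, so it is worth checking the values before applying them. The sketch below uses the standard `ipaddress` module with the example addresses from above; `gateway_on_link` is an illustrative name.

```python
import ipaddress


def gateway_on_link(address_cidr: str, gateway: str) -> bool:
    """Return True if the gateway lies inside the interface's subnet."""
    interface = ipaddress.ip_interface(address_cidr)
    return ipaddress.ip_address(gateway) in interface.network


if __name__ == "__main__":
    ok = gateway_on_link("192.168.1.100/24", "192.168.1.1")
    print("gateway reachable on-link" if ok else "gateway is outside the subnet")
```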
|
||||
|
||||
## Docker Installation
|
||||
|
||||
### 1. Install Docker Engine
|
||||
|
||||
```bash
|
||||
# Remove old versions
|
||||
sudo dnf remove docker \
|
||||
docker-client \
|
||||
docker-client-latest \
|
||||
docker-common \
|
||||
docker-latest \
|
||||
docker-latest-logrotate \
|
||||
docker-logrotate \
|
||||
docker-engine
|
||||
|
||||
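# Add Docker repository (dnf config-manager is provided by dnf-plugins-core,
# which a Minimal Install may lack)
sudo dnf install -y dnf-plugins-core
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo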
|
||||
|
||||
# Install Docker
|
||||
sudo dnf install -y \
|
||||
docker-ce \
|
||||
docker-ce-cli \
|
||||
containerd.io \
|
||||
docker-buildx-plugin \
|
||||
docker-compose-plugin
|
||||
|
||||
# Start Docker
|
||||
sudo systemctl enable --now docker
|
||||
|
||||
# Verify
|
||||
sudo docker run hello-world
|
||||
```
|
||||
|
||||
### 2. Configure Docker
|
||||
|
||||
**Add user to docker group:**
|
||||
```bash
|
||||
sudo usermod -aG docker $USER
|
||||
newgrp docker
|
||||
|
||||
# Verify (no sudo needed)
|
||||
docker ps
|
||||
```
|
||||
|
||||
**Configure Docker daemon:**
|
||||
|
||||
Create `/etc/docker/daemon.json`:
|
||||
```json
|
||||
{
|
||||
"log-driver": "json-file",
|
||||
"log-opts": {
|
||||
"max-size": "10m",
|
||||
"max-file": "3"
|
||||
},
|
||||
"storage-driver": "overlay2",
|
||||
"features": {
|
||||
"buildkit": true
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Restart Docker:
|
||||
```bash
|
||||
sudo systemctl restart docker
|
||||
```
|
||||
|
||||
## NVIDIA GPU Setup
|
||||
|
||||
### 1. Install NVIDIA Drivers
|
||||
|
||||
```bash
|
||||
# Add EPEL repository
|
||||
sudo dnf install -y epel-release
|
||||
|
||||
# Add NVIDIA repository
|
||||
sudo dnf config-manager --add-repo \
|
||||
https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
|
||||
|
||||
# Install drivers
|
||||
sudo dnf install -y \
|
||||
nvidia-driver \
|
||||
nvidia-driver-cuda \
|
||||
nvidia-settings \
|
||||
nvidia-persistenced
|
||||
|
||||
# Reboot to load drivers
|
||||
sudo reboot
|
||||
```
|
||||
|
||||
### 2. Verify GPU
|
||||
|
||||
```bash
|
||||
# Check driver version
|
||||
nvidia-smi
|
||||
|
||||
# Expected output:
|
||||
# +-----------------------------------------------------------------------------+
|
||||
# | NVIDIA-SMI 535.xx.xx Driver Version: 535.xx.xx CUDA Version: 12.2 |
|
||||
# |-------------------------------+----------------------+----------------------+
|
||||
# | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
|
||||
# | 0 GeForce GTX 1070 Off | 00000000:01:00.0 Off | N/A |
|
||||
# +-------------------------------+----------------------+----------------------+
|
||||
```
|
||||
|
||||
### 3. Install NVIDIA Container Toolkit
|
||||
|
||||
```bash
|
||||
# Add NVIDIA Container Toolkit repository
|
||||
sudo dnf config-manager --add-repo \
|
||||
https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo
|
||||
|
||||
# Install toolkit
|
||||
sudo dnf install -y nvidia-container-toolkit
|
||||
|
||||
# Configure Docker to use nvidia runtime
|
||||
sudo nvidia-ctk runtime configure --runtime=docker
|
||||
|
||||
# Restart Docker
|
||||
sudo systemctl restart docker
|
||||
|
||||
# Test GPU in container
|
||||
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
|
||||
```
|
||||
|
||||
## Storage Setup
|
||||
|
||||
### 1. Create Media Directory
|
||||
|
||||
```bash
|
||||
# Create media directory structure
|
||||
sudo mkdir -p /media/{tv,movies,music,photos,books,audiobooks,comics,homemovies}
|
||||
sudo mkdir -p /media/{downloads,complete,incomplete}
|
||||
|
||||
# Set ownership
|
||||
sudo chown -R $USER:$USER /media
|
||||
|
||||
# Set permissions
|
||||
chmod -R 755 /media
|
||||
```
|
||||
|
||||
### 2. Mount Additional Storage (Optional)
|
||||
|
||||
If using a separate disk for media:
|
||||
|
||||
```bash
|
||||
# Find disk
|
||||
lsblk
|
||||
|
||||
# Format disk (example: /dev/sdb; this erases everything on it)
|
||||
sudo mkfs.ext4 /dev/sdb
|
||||
|
||||
# Get UUID
|
||||
sudo blkid /dev/sdb
|
||||
|
||||
# Add to /etc/fstab
|
||||
echo "UUID=<uuid> /media ext4 defaults,nofail 0 2" | sudo tee -a /etc/fstab
|
||||
|
||||
# Mount
|
||||
sudo mount -a
|
||||
```
|
||||
|
||||
## Homelab Repository Setup
|
||||
|
||||
### 1. Clone Repository
|
||||
|
||||
```bash
|
||||
# Create workspace
|
||||
mkdir -p ~/homelab
|
||||
cd ~/homelab
|
||||
|
||||
# Clone repository
|
||||
git clone https://github.com/efigueroa/homelab.git .
|
||||
|
||||
# Or if using SSH
|
||||
git clone git@github.com:efigueroa/homelab.git .
|
||||
```
|
||||
|
||||
### 2. Create Docker Network
|
||||
|
||||
```bash
|
||||
# Create homelab network
|
||||
docker network create homelab
|
||||
|
||||
# Verify
|
||||
docker network ls | grep homelab
|
||||
```
|
||||
|
||||
### 3. Configure Environment Variables
|
||||
|
||||
```bash
|
||||
# Generate secrets for all services
|
||||
cd ~/homelab
|
||||
|
||||
# LLDAP
|
||||
cd compose/core/lldap
|
||||
openssl rand -hex 32 > /tmp/lldap_jwt_secret
|
||||
openssl rand -base64 32 | tr -d /=+ | cut -c1-32 > /tmp/lldap_pass
|
||||
# Update .env with generated secrets
|
||||
|
||||
# Tinyauth
|
||||
cd ../tinyauth
|
||||
openssl rand -hex 32 > /tmp/tinyauth_session
|
||||
# Update .env (LDAP_BIND_PASSWORD must match LLDAP)
|
||||
|
||||
# Continue for all services...
|
||||
```
|
||||
|
||||
See [`docs/guides/secrets-management.md`](../guides/secrets-management.md) for the complete guide.
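
The "Update .env" steps above are manual. As a minimal sketch, the helper below writes a `KEY=value` pair into a `.env` file, replacing an existing entry or appending a new one. The function names `set_env_var` and `gen_secret` are hypothetical, and the key name in the usage comment is an assumption; use whichever keys each service's `.env` actually defines.

```bash
# set_env_var FILE KEY VALUE
# Replaces the KEY=... line in FILE, or appends it when KEY is absent.
# Hypothetical helper, not part of the repository.
set_env_var() {
    local file="$1" key="$2" value="$3" line tmp
    touch "$file"
    chmod 600 "$file"    # secrets should not be world-readable
    if grep -q "^${key}=" "$file"; then
        tmp=$(mktemp)
        # Rewrite line by line so the value never needs sed escaping
        while IFS= read -r line || [ -n "$line" ]; do
            case $line in
                "${key}="*) printf '%s=%s\n' "$key" "$value" ;;
                *) printf '%s\n' "$line" ;;
            esac
        done < "$file" > "$tmp"
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# gen_secret: 32 random bytes printed as 64 hex characters
gen_secret() {
    openssl rand -hex 32
}

# Example (the key name is an assumption):
#   set_env_var compose/core/lldap/.env LLDAP_JWT_SECRET "$(gen_secret)"
```

Writing straight into the `.env` file avoids leaving secrets in `/tmp`, where files are created world-readable under the default umask.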
|
||||
|
||||
## SELinux Configuration
|
||||
|
||||
AlmaLinux runs SELinux in enforcing mode by default. Configure it for Docker:
|
||||
|
||||
```bash
|
||||
# Check SELinux status
|
||||
getenforce
|
||||
# Should show: Enforcing
|
||||
|
||||
# Allow containers to manage cgroups (needed when a container runs systemd)
|
||||
sudo setsebool -P container_manage_cgroup on
|
||||
|
||||
# If you encounter permission issues:
|
||||
# Option 1: Add SELinux context to directories
|
||||
sudo chcon -R -t container_file_t ~/homelab/compose
|
||||
sudo chcon -R -t container_file_t /media
|
||||
|
||||
# Option 2: Use :Z flag in docker volumes (auto-relabels)
|
||||
# Example: ./data:/data:Z
|
||||
|
||||
# Option 3: Set SELinux to permissive (not recommended)
|
||||
# sudo setenforce 0
|
||||
```
|
||||
|
||||
## System Tuning
|
||||
|
||||
### 1. Increase File Limits
|
||||
|
||||
```bash
|
||||
# Add to /etc/security/limits.conf
|
||||
echo "* soft nofile 65536" | sudo tee -a /etc/security/limits.conf
|
||||
echo "* hard nofile 65536" | sudo tee -a /etc/security/limits.conf
|
||||
|
||||
# Add to /etc/sysctl.conf (run `sysctl fs.file-max` first; skip the fs.file-max line if the current value is already higher)
|
||||
echo "fs.file-max = 65536" | sudo tee -a /etc/sysctl.conf
|
||||
echo "fs.inotify.max_user_watches = 524288" | sudo tee -a /etc/sysctl.conf
|
||||
|
||||
# Apply
|
||||
sudo sysctl -p
|
||||
```
|
||||
|
||||
### 2. Optimize for Media Server
|
||||
|
||||
```bash
|
||||
# Network tuning
|
||||
echo "net.core.rmem_max = 134217728" | sudo tee -a /etc/sysctl.conf
|
||||
echo "net.core.wmem_max = 134217728" | sudo tee -a /etc/sysctl.conf
|
||||
echo "net.ipv4.tcp_rmem = 4096 87380 67108864" | sudo tee -a /etc/sysctl.conf
|
||||
echo "net.ipv4.tcp_wmem = 4096 65536 67108864" | sudo tee -a /etc/sysctl.conf
|
||||
|
||||
# Apply
|
||||
sudo sysctl -p
|
||||
```
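
Appending to `/etc/sysctl.conf` with `tee -a` adds duplicate lines each time the commands are re-run. One alternative, sketched here, is to keep the same four network values in a single drop-in file under `/etc/sysctl.d/`. The function name and the file name `90-homelab.conf` are assumptions.

```bash
# Prints the network tuning values from this section as a sysctl drop-in.
# Hypothetical helper; the values are copied from the commands above.
homelab_sysctl_conf() {
    cat <<'EOF'
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
EOF
}

# Install and apply (overwrites the drop-in, so re-running is safe):
#   homelab_sysctl_conf | sudo tee /etc/sysctl.d/90-homelab.conf >/dev/null
#   sudo sysctl --system
```

`sysctl --system` reloads every configuration file, including `/etc/sysctl.conf`, so the two approaches can coexist.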
|
||||
|
||||
### 3. CPU Governor (Ryzen 5 7600X)
|
||||
|
||||
```bash
|
||||
# Install cpupower
|
||||
sudo dnf install -y kernel-tools
|
||||
|
||||
# Set to performance mode
|
||||
sudo cpupower frequency-set -g performance
|
||||
|
||||
# Apply via sysfs (takes effect immediately but does not persist across reboots)
|
||||
echo "performance" | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
|
||||
```
|
||||
|
||||
## Deployment
|
||||
|
||||
### 1. Deploy Core Services
|
||||
|
||||
```bash
|
||||
cd ~/homelab
|
||||
|
||||
# Create network
|
||||
docker network inspect homelab >/dev/null 2>&1 || docker network create homelab
|
||||
|
||||
# Deploy Traefik
|
||||
cd compose/core/traefik
|
||||
docker compose up -d
|
||||
|
||||
# Deploy LLDAP
|
||||
cd ../lldap
|
||||
docker compose up -d
|
||||
|
||||
# Wait for LLDAP to be ready (30 seconds)
|
||||
sleep 30
|
||||
|
||||
# Deploy Tinyauth
|
||||
cd ../tinyauth
|
||||
docker compose up -d
|
||||
```
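
The fixed `sleep 30` above may wait longer than needed, or not long enough on a slow start. A polling loop is one alternative. This is a sketch: `wait_until` and `container_healthy` are hypothetical helpers, and the usage line assumes the container is named `lldap` and defines a Docker healthcheck, which the compose file may not do.

```bash
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds.
# Returns 1 if it still fails after TIMEOUT_SECONDS.
wait_until() {
    local timeout="$1" waited=0
    shift
    until "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# container_healthy NAME
# Succeeds only when Docker reports the container's healthcheck as "healthy".
# Fails for containers without a healthcheck.
container_healthy() {
    [ "$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}

# Example (assumes a container named "lldap" with a healthcheck):
#   wait_until 60 container_healthy lldap || echo "LLDAP did not become healthy" >&2
```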
|
||||
|
||||
### 2. Configure LLDAP
|
||||
|
||||
```bash
|
||||
# Access LLDAP web UI
|
||||
# https://lldap.fig.systems
|
||||
|
||||
# 1. Login with admin credentials from .env
|
||||
# 2. Create observer user for tinyauth
|
||||
# 3. Create regular users
|
||||
```
|
||||
|
||||
### 3. Deploy Monitoring
|
||||
|
||||
```bash
|
||||
cd ~/homelab
|
||||
|
||||
# Deploy logging stack
|
||||
cd compose/monitoring/logging
|
||||
docker compose up -d
|
||||
|
||||
# Deploy uptime monitoring
|
||||
cd ../uptime
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
### 4. Deploy Services
|
||||
|
||||
See [`README.md`](../../README.md) for the complete deployment order.
|
||||
|
||||
## Verification
|
||||
|
||||
### 1. Check All Services
|
||||
|
||||
```bash
|
||||
# List all running containers
|
||||
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
|
||||
|
||||
# Check networks
|
||||
docker network ls
|
||||
|
||||
# Check volumes
|
||||
docker volume ls
|
||||
```
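
Scanning the full `docker ps` table by eye gets tedious as the number of services grows. The sketch below prints only the containers that need attention. Both function names are hypothetical; the filter relies on the status strings Docker prints, such as `Up 2 hours`, `Up 5 minutes (unhealthy)`, and `Exited (1) 3 minutes ago`.

```bash
# Reads "NAME<TAB>STATUS" lines on stdin and prints the names of containers
# that are not running or that report an unhealthy healthcheck.
flag_problem_containers() {
    local name status
    while IFS="$(printf '\t')" read -r name status; do
        case $status in
            *"(unhealthy)"*) printf '%s\n' "$name" ;;
            Up*) ;;                       # running (healthy, starting, or no healthcheck)
            *) printf '%s\n' "$name" ;;   # exited, restarting, created, ...
        esac
    done
}

# Feeds every container, including stopped ones, through the filter.
list_problem_containers() {
    local tab
    tab=$(printf '\t')
    docker ps -a --format "{{.Names}}${tab}{{.Status}}" | flag_problem_containers
}
```

An empty result from `list_problem_containers` means every container is up and none reports as unhealthy.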
|
||||
|
||||
### 2. Test GPU Access
|
||||
|
||||
```bash
|
||||
# Test in Jellyfin
|
||||
docker exec jellyfin nvidia-smi
|
||||
|
||||
# Test in Ollama
|
||||
docker exec ollama nvidia-smi
|
||||
|
||||
# Test in Immich
|
||||
docker exec immich-machine-learning nvidia-smi
|
||||
```
|
||||
|
||||
### 3. Test Logging
|
||||
|
||||
```bash
|
||||
# Check Promtail is collecting logs
|
||||
docker logs promtail 2>&1 | grep "clients configured"
|
||||
|
||||
# Access Grafana
|
||||
# https://logs.fig.systems
|
||||
|
||||
# Query logs
|
||||
# {container="traefik"}
|
||||
```
|
||||
|
||||
### 4. Test SSL
|
||||
|
||||
```bash
|
||||
# Check certificate
|
||||
curl -vI https://sonarr.fig.systems 2>&1 | grep -i "subject:"
|
||||
|
||||
# Should show valid Let's Encrypt certificate
|
||||
```
|
||||
|
||||
## Backup Strategy
|
||||
|
||||
### 1. VM Snapshots (Proxmox)
|
||||
|
||||
```bash
|
||||
# On Proxmox host
|
||||
# Create snapshot before major changes
|
||||
qm snapshot 100 pre-update-$(date +%Y%m%d)
|
||||
|
||||
# List snapshots
|
||||
qm listsnapshot 100
|
||||
|
||||
# Restore snapshot
|
||||
qm rollback 100 <snapshot-name>
|
||||
```
|
||||
|
||||
### 2. Configuration Backup
|
||||
|
||||
```bash
|
||||
# On VM
|
||||
cd ~/homelab
|
||||
|
||||
# Backup all configs (excludes data directories)
|
||||
tar czf homelab-config-$(date +%Y%m%d).tar.gz \
|
||||
--exclude='*/data' \
|
||||
--exclude='*/db' \
|
||||
--exclude='*/pgdata' \
|
||||
--exclude='*/config' \
|
||||
--exclude='*/models' \
|
||||
--exclude='*_data' \
|
||||
compose/
|
||||
|
||||
# Backup to external storage
|
||||
scp homelab-config-*.tar.gz user@backup-server:/backups/
|
||||
```
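
Dated archives pile up if nothing removes them. A minimal retention sketch follows, assuming the archives sit in one directory and keep the `homelab-config-YYYYMMDD.tar.gz` naming used above. The function name `prune_backups` is hypothetical.

```bash
# prune_backups DIR KEEP
# Deletes all but the newest KEEP archives named homelab-config-YYYYMMDD.tar.gz.
prune_backups() {
    local dir="$1" keep="$2"
    # Globs expand in sorted order, and the date is zero-padded,
    # so the oldest archive is always first.
    set -- "$dir"/homelab-config-*.tar.gz
    [ -e "$1" ] || return 0    # no archives found
    while [ "$#" -gt "$keep" ]; do
        rm -f -- "$1"
        shift
    done
}

# Example: keep the four most recent archives in the current directory
#   prune_backups . 4
```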
|
||||
|
||||
### 3. Automated Backups with Backrest
|
||||
|
||||
The Backrest service is included and configured. See:
|
||||
- `compose/services/backrest/`
|
||||
- Access: https://backup.fig.systems
|
||||
|
||||
## Maintenance
|
||||
|
||||
### Weekly
|
||||
|
||||
```bash
|
||||
# Update containers
|
||||
cd ~/homelab
|
||||
find compose -name "compose.yaml" -type f | while read compose; do
|
||||
dir=$(dirname "$compose")
|
||||
echo "Updating $dir"
|
||||
cd "$dir"
|
||||
docker compose pull
|
||||
docker compose up -d
|
||||
cd ~/homelab
|
||||
done
|
||||
|
||||
# Clean up old images
|
||||
docker image prune -a -f
|
||||
|
||||
# Check disk space
|
||||
df -h
|
||||
ncdu /media
|
||||
```
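
A variant of the update loop above, offered as a sketch rather than a replacement. It runs each stack in a subshell so the caller's working directory never changes, and it skips `up -d` for a stack whose pull failed. The function names are hypothetical; the `compose.yaml` file name matches the loop above.

```bash
# Lists each directory under ROOT (default: compose) that holds a compose.yaml.
compose_dirs() {
    find "${1:-compose}" -type f -name 'compose.yaml' -exec dirname {} \; | sort
}

# Pulls and restarts every stack found by compose_dirs.
update_all() {
    local dir
    compose_dirs "${1:-compose}" | while IFS= read -r dir; do
        echo "Updating $dir"
        # </dev/null keeps docker from reading the directory list on stdin
        (cd "$dir" && docker compose pull && docker compose up -d) </dev/null
    done
}

# Example:
#   cd ~/homelab && update_all compose
```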
|
||||
|
||||
### Monthly
|
||||
|
||||
```bash
|
||||
# Update AlmaLinux
|
||||
sudo dnf update -y
|
||||
|
||||
# Update NVIDIA drivers (if available)
|
||||
sudo dnf update -y 'nvidia-driver*'
|
||||
|
||||
# Reboot if kernel updated
|
||||
sudo reboot
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Services Won't Start
|
||||
|
||||
```bash
|
||||
# Check SELinux denials
|
||||
sudo ausearch -m avc -ts recent
|
||||
|
||||
# If SELinux is blocking:
|
||||
sudo setsebool -P container_manage_cgroup on
|
||||
|
||||
# Or relabel directories
|
||||
sudo chcon -R -t container_file_t ~/homelab/compose
|
||||
```
|
||||
|
||||
### GPU Not Detected
|
||||
|
||||
```bash
|
||||
# Check GPU is passed through
|
||||
lspci | grep -i nvidia
|
||||
|
||||
# Check drivers loaded
|
||||
lsmod | grep nvidia
|
||||
|
||||
# Reinstall drivers
|
||||
sudo dnf reinstall -y 'nvidia-driver*'
|
||||
sudo reboot
|
||||
```
|
||||
|
||||
### Network Issues
|
||||
|
||||
```bash
|
||||
# Check firewall
|
||||
sudo firewall-cmd --list-all
|
||||
|
||||
# Add ports if needed
|
||||
sudo firewall-cmd --permanent --add-port=80/tcp
|
||||
sudo firewall-cmd --permanent --add-port=443/tcp
|
||||
sudo firewall-cmd --reload
|
||||
|
||||
# Check Docker network
|
||||
docker network inspect homelab
|
||||
```
|
||||
|
||||
### Permission Denied Errors
|
||||
|
||||
```bash
|
||||
# Check ownership
|
||||
ls -la ~/homelab/compose/*/
|
||||
|
||||
# Fix ownership
|
||||
sudo chown -R $USER:$USER ~/homelab
|
||||
|
||||
# Check SELinux context
|
||||
ls -Z ~/homelab/compose
|
||||
|
||||
# Fix SELinux labels
|
||||
sudo chcon -R -t container_file_t ~/homelab/compose
|
||||
```
|
||||
|
||||
## Performance Monitoring
|
||||
|
||||
### System Stats
|
||||
|
||||
```bash
|
||||
# CPU usage
|
||||
htop
|
||||
|
||||
# GPU usage
|
||||
watch -n 1 nvidia-smi
|
||||
|
||||
# Disk I/O (iostat is provided by the sysstat package)
|
||||
iostat -x 1
|
||||
|
||||
# Network (iftop is available from EPEL)
|
||||
iftop
|
||||
|
||||
# Per-container stats
|
||||
docker stats
|
||||
```
|
||||
|
||||
### Resource Limits
|
||||
|
||||
Example container resource limits:
|
||||
|
||||
```yaml
|
||||
# In compose.yaml
|
||||
deploy:
|
||||
  resources:
|
||||
    limits:
|
||||
      cpus: '2.0'
|
||||
      memory: 4G
|
||||
    reservations:
|
||||
      cpus: '1.0'
|
||||
      memory: 2G
|
||||
```
|
||||
|
||||
## Security Hardening
|
||||
|
||||
### 1. Disable Root SSH
|
||||
|
||||
```bash
|
||||
# Edit /etc/ssh/sshd_config
|
||||
sudo sed -i -E 's/^#?PermitRootLogin .*/PermitRootLogin no/' /etc/ssh/sshd_config
|
||||
|
||||
# Restart SSH
|
||||
sudo systemctl restart sshd
|
||||
```
|
||||
|
||||
### 2. Configure Fail2Ban
|
||||
|
||||
```bash
|
||||
# Install
|
||||
sudo dnf install -y fail2ban
|
||||
|
||||
# Configure
|
||||
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
|
||||
|
||||
# Edit /etc/fail2ban/jail.local
|
||||
# [sshd]
|
||||
# enabled = true
|
||||
# maxretry = 3
|
||||
# bantime = 3600
|
||||
|
||||
# Start
|
||||
sudo systemctl enable --now fail2ban
|
||||
```
|
||||
|
||||
### 3. Automatic Updates
|
||||
|
||||
```bash
|
||||
# Install dnf-automatic
|
||||
sudo dnf install -y dnf-automatic
|
||||
|
||||
# Configure /etc/dnf/automatic.conf
|
||||
# apply_updates = yes
|
||||
|
||||
# Enable
|
||||
sudo systemctl enable --now dnf-automatic.timer
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. ✅ VM created and AlmaLinux installed
|
||||
2. ✅ Docker and NVIDIA drivers configured
|
||||
3. ✅ Homelab repository cloned
|
||||
4. ✅ Network and storage configured
|
||||
5. ⬜ Deploy core services
|
||||
6. ⬜ Configure SSO
|
||||
7. ⬜ Deploy all services
|
||||
8. ⬜ Configure backups
|
||||
9. ⬜ Set up monitoring
|
||||
|
||||
---
|
||||
|
||||
**System ready for deployment!** 🚀
|
||||