Latest commit d9c424e4fc by Eduardo Figueroa (2025-12-12 23:17:26 +00:00): refactor(monitoring): Remove Tinyauth middleware from monitoring services. Remove Tinyauth SSO middleware from Loki and Uptime Kuma. These services will migrate to Authelia for authentication.

Uptime Kuma - Status & Uptime Monitoring

Beautiful uptime monitoring and alerting for all your homelab services.

Overview

Uptime Kuma monitors the health and uptime of your services:

✅ HTTP(s) Monitoring: Check if web services are responding
✅ TCP Port Monitoring: Check if services are listening on ports
✅ Docker Container Monitoring: Check container status
✅ Response Time: Measure how fast services respond
✅ SSL Certificate Monitoring: Alert before certificates expire
✅ Status Pages: Public or private status pages
✅ Notifications: Email, Discord, Slack, Pushover, and 90+ more
✅ Beautiful UI: Clean, modern interface

Quick Start

1. Deploy

cd ~/homelab/compose/monitoring/uptime
docker compose up -d

2. Access Web UI

Go to: https://status.fig.systems

3. Create Admin Account

On first visit, you'll be prompted to create an admin account:

Username: admin (or your choice)
Password: Strong password
Click "Create"

4. Add Your First Monitor

Click "Add New Monitor"

Example: Monitor Jellyfin

Monitor Type: HTTP(s)
Friendly Name: Jellyfin
URL: https://flix.fig.systems
Heartbeat Interval: 60 seconds
Retries: 3
Click Save

Uptime Kuma will now check Jellyfin every 60 seconds!

Monitoring Your Services

Quick Setup All Services

Here's a template for all your homelab services:

Core Services:

Name: Traefik Dashboard
Type: HTTP(s)
URL: https://traefik.fig.systems
Interval: 60s

Name: LLDAP
Type: HTTP(s)
URL: https://lldap.fig.systems
Interval: 60s

Name: Grafana Logs
Type: HTTP(s)
URL: https://logs.fig.systems
Interval: 60s

Media Services:

Name: Jellyfin
Type: HTTP(s)
URL: https://flix.fig.systems
Interval: 60s

Name: Immich
Type: HTTP(s)
URL: https://photos.fig.systems
Interval: 60s

Name: Jellyseerr
Type: HTTP(s)
URL: https://requests.fig.systems
Interval: 60s

Name: Sonarr
Type: HTTP(s)
URL: https://sonarr.fig.systems
Interval: 60s

Name: Radarr
Type: HTTP(s)
URL: https://radarr.fig.systems
Interval: 60s

Utility Services:

Name: Homarr Dashboard
Type: HTTP(s)
URL: https://home.fig.systems
Interval: 60s

Name: Backrest
Type: HTTP(s)
URL: https://backup.fig.systems
Interval: 60s

Name: Linkwarden
Type: HTTP(s)
URL: https://links.fig.systems
Interval: 60s

Name: Vikunja
Type: HTTP(s)
URL: https://tasks.fig.systems
Interval: 60s

Advanced Monitoring Options

Monitor Docker Containers Directly

Setup:

Add New Monitor
Type: Docker Container
Docker Daemon: unix:///var/run/docker.sock
Container Name: jellyfin
Click Save

Benefits:

Checks if container is running
Monitors container restarts
No network requests needed

Note: Requires mounting Docker socket (already configured).

Monitor TCP Ports

Example: Monitor PostgreSQL

Type: TCP Port
Hostname: linkwarden-postgres
Port: 5432
Interval: 60s

Check SSL Certificates

For HTTP(s) monitors on https:// URLs, Uptime Kuma automatically:

Checks SSL certificate validity
Alerts when certificate expires soon (7 days default)
Shows certificate expiry date
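
To cross-check what Uptime Kuma reports, you can test the expiry window by hand. This is a minimal sketch, assuming openssl is installed on the host. The function names are hypothetical, and the 7-day threshold only mirrors the default mentioned above.

```bash
# Convert a day count to seconds for openssl's -checkend flag.
days_to_seconds() {
  echo $(( $1 * 86400 ))
}

# Succeeds (exit 0) if the certificate served by host:443 is still valid
# N days from now. Needs network access; it also fails if the host
# cannot be reached, so check connectivity before trusting a failure.
cert_valid_for_days() {
  host=$1
  days=$2
  echo | openssl s_client -servername "$host" -connect "$host:443" 2>/dev/null \
    | openssl x509 -noout -checkend "$(days_to_seconds "$days")" >/dev/null
}

# Usage:
#   cert_valid_for_days flix.fig.systems 7 || echo "certificate expires within 7 days"
```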

Keyword Monitoring

Check if a page contains specific text:

Type: HTTP(s) - Keyword
URL: https://home.fig.systems
Keyword: "Homarr"  # Check page contains "Homarr"
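
To see roughly what a keyword check does, you can reproduce it from a shell. This is a sketch that assumes a literal, case-sensitive match. contains_keyword is a hypothetical helper, not part of Uptime Kuma.

```bash
# Succeeds if the text on stdin contains the keyword as a literal string.
contains_keyword() {
  grep -q -F -- "$1"
}

# Usage against a live page (needs network access):
#   curl -fsS https://home.fig.systems | contains_keyword "Homarr" && echo "keyword found"
```

If the page sits behind a login, the fetched body is the login page, so a keyword that only appears after signing in will not match.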

Notifications

Setup Alerts

Click Settings (gear icon)
Click Notifications
Click Setup Notification

Popular Options

Email

Type: Email (SMTP)
Host: smtp.gmail.com
Port: 587
Security: STARTTLS (port 587; implicit TLS uses port 465)
Username: your-email@gmail.com
Password: your-app-password
From: alerts@yourdomain.com
To: you@email.com

Discord

Type: Discord
Webhook URL: https://discord.com/api/webhooks/...
(Get from Discord Server Settings → Integrations → Webhooks)

Slack

Type: Slack
Webhook URL: https://hooks.slack.com/services/...
(Get from Slack App → Incoming Webhooks)

Pushover (Mobile)

Type: Pushover
User Key: (from Pushover account)
App Token: (create app in Pushover)
Priority: Normal

Gotify (Self-hosted)

Type: Gotify
Server URL: https://gotify.yourdomain.com
App Token: (from Gotify)
Priority: 5

Apply to Monitors

After setting up notification:

Edit a monitor
Scroll to Notifications
Select your notification method
Click Save

Or apply to all monitors:

Settings → Notifications
Click Apply on all existing monitors

Status Pages

Create Public Status Page

Perfect for showing service status to family/friends!

Setup:

Click Status Pages
Click Add New Status Page
Slug: homelab (creates /status/homelab)
Title: Homelab Status
Description: Status of all homelab services
Click Next

Add Services:

Drag monitors into "Public" or "Groups"
Organize by category (Core, Media, Utilities)
Click Save

Access:

Private: https://status.fig.systems/status/homelab
Or make public (no login required)

Share with family:

https://status.fig.systems/status/homelab

Customize Status Page

Options:

Show/hide uptime percentage
Show/hide response time
Custom domain
Theme (light/dark/auto)
Custom CSS
Password protection

Tags and Groups

Organize Monitors with Tags

Create Tags:

Click Manage Tags
Add tags like:
- core
- media
- critical
- production

Apply to Monitors:

Edit monitor
Scroll to Tags
Select tags
Save

Filter by Tag:

Click tag name to show only those monitors

Create Monitor Groups

Group by service type:

Settings → Groups
Create groups:
- Core Infrastructure
- Media Services
- Productivity
- Monitoring

Drag monitors into groups for organization.

Maintenance Windows

Schedule Maintenance

Pause notifications during planned downtime:

Edit monitor
Click Maintenance
Add Maintenance
Set start/end time
Select monitors
Save

During maintenance:

Monitor still checks but doesn't alert
Status page shows "In Maintenance"

Best Practices

Monitor Configuration

Heartbeat Interval:

Critical services: 30-60 seconds
Normal services: 60-120 seconds
Background jobs: 300-600 seconds

Retries:

Set to 2-3 to avoid false positives
The service is marked down, and an alert is sent, only after the initial check and every retry have failed
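
Interval and retries together decide how quickly you hear about an outage. The arithmetic below is a rough sketch. It assumes a retry interval that is configured separately from the heartbeat interval, and it ignores the request timeout. seconds_to_alert is a hypothetical helper.

```bash
# Worst-case seconds between an outage starting and the alert firing:
# one full heartbeat interval until the first failed check, then one
# retry interval per retry.
seconds_to_alert() {
  interval=$1
  retries=$2
  retry_interval=$3
  echo $(( interval + retries * retry_interval ))
}

seconds_to_alert 60 3 60    # prints 240
seconds_to_alert 30 2 30    # prints 90
```

With a 60-second interval and 3 retries, an alert can take about four minutes to arrive. Lower the retry interval rather than the retry count if that is too slow for a critical service.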

Timeout:

Web services: 10-30 seconds
APIs: 5-10 seconds
Slow services: 30-60 seconds

What to Monitor

Critical (Monitor these!):

✅ Traefik (if this is down, everything is down)
✅ LLDAP (SSO depends on this)
✅ Core services users depend on

Important:

✅ Jellyfin, Immich (main media services)
✅ Sonarr, Radarr (automation)
✅ Backrest (backups)

Nice to have:

⬜ Utility services
⬜ Less critical services

Don't over-monitor:

Internal components (databases, Redis, etc.)
A failure there usually shows up in the health of the main service, so monitor that instead

Notification Strategy

Alert fatigue is real!

Good approach:

Critical services → Immediate push notification
Important services → Email
Nice-to-have → Email digest

Don't:

Alert on every blip
Send all alerts to mobile push
Alert on expected downtime

Integration with Loki

Uptime Kuma and Loki complement each other:

Uptime Kuma:

✅ Is the service UP or DOWN?
✅ How long was it down?
✅ Response time trends

Loki:

✅ WHY did it go down?
✅ What errors happened?
✅ Historical log analysis

Workflow:

Uptime Kuma alerts you: "Jellyfin is down!"
Go to Grafana/Loki
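Query: {container="jellyfin"} (with the time range set to "Last 15 minutes"; LogQL takes the time window from the query range, not from a filter inside the query)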
See what went wrong
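
The same lookup can be scripted against Loki's HTTP API. This is a sketch under assumptions: LOKI_URL and its default are placeholders for wherever Loki listens in your setup, the container label matches the query above, and both function names are hypothetical.

```bash
# Nanosecond Unix timestamp for "N minutes before the given epoch second".
minutes_ago_ns() {
  now_s=$1
  minutes=$2
  echo "$(( now_s - minutes * 60 ))000000000"
}

# Fetch up to 100 log lines from the last 15 minutes for one container.
# The end of the range is left out, so Loki uses the current time.
recent_logs() {
  container=$1
  loki_url=${LOKI_URL:-http://loki:3100}   # assumption: adjust to your Loki address
  curl -sG "$loki_url/loki/api/v1/query_range" \
    --data-urlencode "query={container=\"$container\"}" \
    --data-urlencode "start=$(minutes_ago_ns "$(date +%s)" 15)" \
    --data-urlencode "limit=100"
}

# Usage (needs network access to Loki):
#   recent_logs jellyfin
```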

Metrics and Graphs

Built-in Metrics

Uptime Kuma tracks:

Uptime %: 99.9%, 99.5%, etc.
Response Time: Average, min, max
Ping: Latency to service
Certificate Expiry: Days until SSL expires
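
To put the uptime percentages in perspective, the arithmetic can be sketched in a few lines. This is a simplification that treats uptime as the share of successful checks, which may not be exactly how Uptime Kuma calculates it. Both helper names are hypothetical.

```bash
# Uptime as the percentage of checks that succeeded.
uptime_percent() {
  awk -v total="$1" -v failed="$2" \
    'BEGIN { printf "%.2f\n", (total - failed) / total * 100 }'
}

# Minutes of downtime allowed over a number of days at a target percentage.
downtime_budget_minutes() {
  awk -v target="$1" -v days="$2" \
    'BEGIN { printf "%.1f\n", (100 - target) / 100 * days * 1440 }'
}

uptime_percent 1440 2             # one day of 60s checks, 2 failures: prints 99.86
downtime_budget_minutes 99.9 30   # 99.9% over 30 days: prints 43.2
```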

Response Time Graph

Click any monitor to see:

24-hour response time graph
Uptime/downtime periods
Recent incidents

Export Data

Export uptime data:

Settings → Backup
Export JSON (includes all monitors and data)
Store backup safely

Troubleshooting

Monitor Shows Down But Service Works

Check:

SSL Certificate: Is it valid?
SSO: Does the monitor need to log in first?
Timeout: Is timeout too short?
Network: Can Uptime Kuma reach the service?

Solutions:

Increase timeout
Check accepted status codes (200-299)
Verify URL is correct
Check Uptime Kuma logs: docker logs uptime-kuma
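
It can help to reproduce the check by hand and compare the status code with the accepted range. This is a minimal sketch using curl. The helper names are hypothetical, and a request from the host may take a different network path than one from the Uptime Kuma container.

```bash
# Succeeds if the HTTP status code falls in the accepted 200-299 range.
status_accepted() {
  [ "$1" -ge 200 ] && [ "$1" -le 299 ]
}

# Print the status code a plain GET receives, with a 10-second timeout.
# curl does not follow redirects unless -L is given, so a redirect to a
# login page shows up as a 3xx code. A failed connection prints 000.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Usage (needs network access):
#   code=$(probe_status https://flix.fig.systems)
#   if status_accepted "$code"; then echo "up ($code)"; else echo "down ($code)"; fi
```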

Docker Container Monitor Not Working

Requirements:

Docker socket must be mounted (✅ already configured)
Container name must be exact

Test:

docker exec uptime-kuma ls /var/run/docker.sock
# Should show the socket file

Notifications Not Sending

Check:

Test notification in Settings → Notifications
Check Uptime Kuma logs
Verify notification service credentials
Check if notification is enabled on monitor

Can't Access Web UI

Check:

# Container running?
docker ps | grep uptime-kuma

# Logs
docker logs uptime-kuma

# Traefik routing
docker logs traefik | grep uptime

Advanced Features

API Access

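Uptime Kuma's web UI talks to the server over an internal Socket.IO (WebSocket) API. It is not a stable public API, so expect it to change between releases.

API keys authenticate the Prometheus metrics endpoint (/metrics).

Get API Key:

Settings → API Keys
Generate a new key
Use it with Prometheus or other monitoring tools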
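
One way to use a key with monitoring tools is the Prometheus-style /metrics endpoint. The sketch below rests on several assumptions: the key is passed as the basic-auth password with an empty username, the endpoint exposes a monitor_status metric where 0 means down, and UPTIME_KUMA_URL and UPTIME_KUMA_API_KEY are placeholder variable names. Verify each against your installed version.

```bash
# Fetch the metrics text. The API key is sent as the basic-auth password.
fetch_metrics() {
  curl -fsS -u ":$UPTIME_KUMA_API_KEY" \
    "${UPTIME_KUMA_URL:-https://status.fig.systems}/metrics"
}

# Count monitors reported as down (value 0) in metrics text on stdin.
count_down_monitors() {
  grep -c '^monitor_status{.*} 0$'
}

# Usage (needs network access and a valid key):
#   fetch_metrics | count_down_monitors
```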

Docker Socket Monitoring

Already configured! You can monitor:

Container status (running/stopped)
Container restarts
Resource usage (via Docker stats)

Multiple Status Pages

Create different status pages:

/status/public - For family/friends
/status/critical - Only critical services
/status/media - Media services only

Custom CSS

Brand your status page:

Status Page → Edit
Custom CSS
Add styling

Example:

body {
  background: #1a1a1a;
}
.title {
  color: #00ff00;
}

Resource Usage

Typical usage:

RAM: 50-150MB
CPU: Very low (only during checks)
Disk: <100MB
Network: Minimal (only during checks)

Very lightweight!

Backup and Restore

Backup

UI export:

Settings → Backup
Export

Manual backup:

cd ~/homelab/compose/monitoring/uptime
tar czf uptime-backup-$(date +%Y%m%d).tar.gz ./data
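
If you run the manual backup regularly, a small wrapper with retention keeps the archive directory from growing. This is a sketch, not part of the existing setup. The function names and the ~/backups destination are hypothetical, and the destination directory must already exist.

```bash
# Archive a data directory into dest_dir as uptime-backup-YYYYMMDD.tar.gz
# and print the path of the archive.
backup_data() {
  src_dir=$1
  dest_dir=$2
  archive="$dest_dir/uptime-backup-$(date +%Y%m%d).tar.gz"
  tar czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" \
    && echo "$archive"
}

# Delete archives in a directory that are older than the given number of days.
prune_backups() {
  find "$1" -name 'uptime-backup-*.tar.gz' -mtime +"$2" -delete
}

# Usage:
#   cd ~/homelab/compose/monitoring/uptime
#   docker compose stop      # optional: avoids copying the database mid-write
#   backup_data ./data ~/backups && prune_backups ~/backups 30
#   docker compose start
```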

Restore

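cd ~/homelab/compose/monitoring/uptime
# Stop the container so the data directory is not in use
docker compose down
# Extract the archive; this recreates ./data in the current directory
tar xzf uptime-backup-YYYYMMDD.tar.gz
docker compose up -d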

Comparison: Uptime Kuma vs Loki

Feature	Uptime Kuma	Loki
Purpose	Uptime monitoring	Log aggregation
Checks	HTTP, TCP, Ping, Docker	Logs only
Alerts	Service down, slow	Log patterns
Response Time	✅ Yes	❌ No
Uptime %	✅ Yes	❌ No
SSL Monitoring	✅ Yes	❌ No
Why Service Down	❌ No	✅ Yes (via logs)
Historical Logs	❌ No	✅ Yes
Status Pages	✅ Yes	❌ No

Use both together!

Uptime Kuma tells you WHAT is down
Loki tells you WHY it went down

Next Steps

✅ Deploy Uptime Kuma
✅ Add monitors for all services
✅ Set up notifications (Email, Discord, etc.)
✅ Create status page
✅ Test alerts by stopping a service
⬜ Share status page with family
⬜ Set up maintenance windows
⬜ Review and tune check intervals

Know instantly when something goes down! 🚨