Centralized Logging Stack
Grafana Loki + Promtail + Grafana for centralized Docker container log aggregation and visualization.
Overview
This stack provides centralized logging for all Docker containers in your homelab:
- Loki: Log aggregation backend (like Prometheus but for logs)
- Promtail: Agent that collects logs from Docker containers
- Grafana: Web UI for querying and visualizing logs
Why This Stack?
- ✅ Lightweight: Minimal resource usage compared to ELK stack
- ✅ Docker-native: Automatically discovers and collects logs from all containers
- ✅ Powerful search: LogQL query language for filtering and searching
- ✅ Retention: Configurable log retention (default: 30 days)
- ✅ Labels: Automatic labeling by container, image, compose project
- ✅ Integrated: Works seamlessly with existing homelab services
Quick Start
1. Configure Environment
cd ~/homelab/compose/monitoring/logging
nano .env
Update:
# Change this!
GF_SECURITY_ADMIN_PASSWORD=<your-strong-password>
2. Deploy the Stack
docker compose up -d
3. Access Grafana
Go to: https://logs.fig.systems
Default credentials:
- Username: admin
- Password: <your GF_SECURITY_ADMIN_PASSWORD>
⚠️ Change the password immediately after first login!
4. View Logs
- Click "Explore" (compass icon) in left sidebar
- Select "Loki" datasource (should be selected by default)
- Start querying logs!
Usage
Basic Log Queries
View all logs from a container:
{container="jellyfin"}
View logs from a compose project:
{compose_project="media"}
View logs from specific service:
{compose_service="lldap"}
Filter by log level:
{container="immich_server"} |= "error"
Exclude lines:
{container="traefik"} != "404"
Multiple filters:
{container="jellyfin"} |= "error" != "404"
Advanced Queries
Count errors per minute:
sum(count_over_time({container="jellyfin"} |= "error" [1m])) by (container)
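To make the windowing semantics concrete, here is a minimal Python sketch (illustrative only, not Loki code; the function and sample data are hypothetical) of what `count_over_time(... |= "error" [1m])` computes over a stream of timestamped lines:

```python
# Illustrative model of LogQL's count_over_time with a line filter.
# Loki does this server-side; everything here is a toy reimplementation.

def count_over_time(entries, window_end, window_seconds=60, needle="error"):
    """Count log lines containing `needle` in (window_end - window, window_end]."""
    start = window_end - window_seconds
    return sum(
        1
        for ts, line in entries
        if start < ts <= window_end and needle in line
    )

logs = [
    (10, "server started"),
    (30, "error: db timeout"),
    (55, "error: retrying"),
    (90, "ok"),
]
print(count_over_time(logs, window_end=60))  # 2 errors fall inside the first minute
```

The real query additionally groups the counts by the `container` label via `sum(...) by (container)`.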
Rate of logs:
rate({container="traefik"}[5m])
Logs from the last hour: select "Last 1 hour" in Grafana's time picker (or pass start/end to the Loki API). LogQL itself has no timestamp filter inside the query.
Filter by multiple containers:
{container=~"jellyfin|immich.*|sonarr"}
Extract and filter JSON:
{container="linkwarden"} | json | level="error"
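All of these queries can also be run outside Grafana against Loki's HTTP API. A small Python sketch that builds a `query_range` request URL (the endpoint path is Loki's documented API; the `localhost:3100` host assumes the port mapping in this stack):

```python
from urllib.parse import urlencode

# Build a Loki query_range URL for a LogQL query.
# Host/port are assumptions matching this stack's compose file.
base = "http://localhost:3100/loki/api/v1/query_range"
params = urlencode({
    "query": '{container="jellyfin"} |= "error"',
    "limit": 100,
})
url = f"{base}?{params}"
print(url)
```

Once the stack is up, fetch it with `curl "$url"` or `urllib.request.urlopen(url)` to get JSON results back.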
Configuration
Log Retention
Default: 30 days
To change retention period:
Edit .env:
LOKI_RETENTION_PERIOD=60d # Keep logs for 60 days
Edit loki-config.yaml:
limits_config:
retention_period: 60d # Must match .env
table_manager:
retention_period: 60d # Must match above
Restart:
docker compose restart loki
Adjust Resource Limits
Edit loki-config.yaml:
limits_config:
ingestion_rate_mb: 10 # MB/sec per stream
ingestion_burst_size_mb: 20 # Burst size
Add Custom Labels
Edit promtail-config.yaml:
scrape_configs:
- job_name: docker
docker_sd_configs:
- host: unix:///var/run/docker.sock
relabel_configs:
# Add custom label
- source_labels: ['__meta_docker_container_label_environment']
target_label: 'environment'
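Conceptually, a `relabel_configs` rule of this shape just copies a discovered meta label onto the label set that gets indexed. A toy Python model of that one rule (illustrative only, not Promtail code):

```python
# Toy model of the relabel rule above: copy the discovered
# __meta_docker_container_label_environment value to `environment`.
def relabel(discovered: dict) -> dict:
    labels = {}
    src = discovered.get("__meta_docker_container_label_environment")
    if src is not None:
        labels["environment"] = src
    return labels

meta = {"__meta_docker_container_label_environment": "production"}
print(relabel(meta))  # {'environment': 'production'}
```

Containers without the Docker label simply get no `environment` label, which is why queries like `{environment="production"}` only match labeled services.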
How It Works
Architecture
Docker Containers
↓ (logs via Docker socket)
Promtail (scrapes and ships)
↓ (HTTP push)
Loki (stores and indexes)
↓ (LogQL queries)
Grafana (visualization)
Log Collection
Promtail automatically collects logs from:
- All Docker containers via Docker socket
- System logs from /var/log
Logs are labeled with:
- container: Container name
- image: Docker image
- compose_project: Docker Compose project name
- compose_service: Service name from compose.yaml
- stream: stdout or stderr
Storage
Logs are stored in:
- Location: ./loki-data/
- Format: Compressed chunks
- Index: BoltDB
- Retention: Automatic cleanup after retention period
Integration with Services
Option 1: Automatic (Default)
Promtail automatically discovers all containers. No changes needed!
Option 2: Explicit Labels (Recommended)
Add labels to services for better organization:
Edit any service's compose.yaml:
services:
servicename:
# ... existing config ...
labels:
# ... existing labels ...
# Add logging labels
logging: "promtail"
log_level: "info"
environment: "production"
These labels will be available in Loki for filtering.
Option 3: Send Logs Directly to Loki
Instead of Promtail scraping, send logs directly:
Edit service compose.yaml:
services:
servicename:
# ... existing config ...
logging:
driver: loki
options:
loki-url: "http://loki:3100/loki/api/v1/push"
loki-external-labels: "container={{.Name}},compose_project={{.Config.Labels[\"com.docker.compose.project\"]}}"
Note: This requires installing the Loki Docker logging driver plugin on the host. Promtail scraping (Option 1) is simpler and is the recommended default.
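Whichever client you use, the body POSTed to `/loki/api/v1/push` has the same shape. A minimal Python sketch of the payload (label names and the log line are example values):

```python
import json
import time

# Minimal Loki push payload (POST /loki/api/v1/push).
# Loki expects the timestamp as Unix epoch nanoseconds, as a string.
ts_ns = str(int(time.time() * 1e9))
payload = {
    "streams": [
        {
            "stream": {"container": "servicename", "compose_project": "demo"},
            "values": [[ts_ns, "hello from a custom client"]],
        }
    ]
}
print(json.dumps(payload))
```

Every key in `"stream"` becomes a queryable Loki label, so keep that set small and low-cardinality.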
Grafana Dashboards
Built-in Explore
Best way to start - use Grafana's Explore view:
- Click "Explore" icon (compass)
- Select "Loki" datasource
- Use builder to create queries
- Save interesting queries
Pre-built Dashboards
You can import community dashboards:
- Go to Dashboards → Import
- Use dashboard ID: 13639 (Docker logs dashboard)
- Select "Loki" as datasource
- Import
Create Custom Dashboard
- Click "+" → "Dashboard"
- Add panel
- Select Loki datasource
- Build query using LogQL
- Save dashboard
Example panels:
- Error count by container
- Log volume over time
- Top 10 logging containers
- Recent errors table
Alerting
Create Log-Based Alerts
- Go to Alerting → Alert rules
- Create new alert rule
- Query: sum(count_over_time({container="jellyfin"} |= "error" [5m])) > 10
- Set thresholds and notification channels
- Save
Example alerts:
- Too many errors in container
- Container restarted
- Disk space warnings
- Failed authentication attempts
Troubleshooting
Promtail Not Collecting Logs
Check Promtail is running:
docker logs promtail
Verify Docker socket access:
docker exec promtail ls -la /var/run/docker.sock
Test Promtail config:
docker exec promtail promtail -config.file=/etc/promtail/config.yaml -dry-run
Loki Not Receiving Logs
Check Loki health:
curl http://localhost:3100/ready
View Loki logs:
docker logs loki
Check Promtail is pushing:
docker logs promtail | grep -i push
Grafana Can't Connect to Loki
Test Loki from Grafana container:
docker exec grafana wget -O- http://loki:3100/ready
Check datasource configuration:
- Grafana → Configuration → Data sources → Loki
- URL should be:
http://loki:3100
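If you'd rather script the check, a small Python probe against the same `/ready` endpoint (the default URL is an assumption matching this stack; returns False on any connection error):

```python
from urllib.request import urlopen
from urllib.error import URLError

def loki_ready(url: str = "http://localhost:3100/ready", timeout: float = 2.0) -> bool:
    """Return True if Loki answers its /ready endpoint with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False

print(loki_ready())
```

Run it inside the Grafana container's network namespace (with `url="http://loki:3100/ready"`) to test exactly what the datasource sees.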
No Logs Appearing
Wait a few minutes - logs take time to appear
Check retention:
# Logs older than retention period are deleted
grep retention_period loki-config.yaml
Verify time range in Grafana:
- Make sure selected time range includes recent logs
- Try "Last 5 minutes"
High Disk Usage
Check Loki data size:
du -sh ./loki-data
Reduce retention:
LOKI_RETENTION_PERIOD=7d # Shorter retention
Manual cleanup:
# Stop Loki
docker compose stop loki
# Remove old data (CAREFUL!)
rm -rf ./loki-data/chunks/*
# Restart
docker compose start loki
Performance Tuning
For Low Resources (< 8GB RAM)
Edit loki-config.yaml:
limits_config:
retention_period: 7d # Shorter retention
ingestion_rate_mb: 5 # Lower rate
ingestion_burst_size_mb: 10 # Lower burst
query_range:
results_cache:
cache:
embedded_cache:
max_size_mb: 50 # Smaller cache
For High Volume
Edit loki-config.yaml:
limits_config:
ingestion_rate_mb: 20 # Higher rate
ingestion_burst_size_mb: 40 # Higher burst
query_range:
results_cache:
cache:
embedded_cache:
max_size_mb: 200 # Larger cache
Best Practices
Log Levels
Configure services to log appropriately:
- Production: info or warning
- Development: debug
- Troubleshooting: trace
Too much logging = higher resource usage!
Retention Strategy
- Critical services: 60+ days
- Normal services: 30 days
- High volume services: 7-14 days
Query Optimization
- Use specific labels: {container="name"}, not {container=~".*"}
- Limit time range: query hours, not days, when possible
- Use filters early: apply |= "error" before parsing stages like | json
- Avoid regex when possible: |= "string" is faster than |~ "reg.*ex"
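The substring-versus-regex advice is not Loki-specific; plain containment is cheaper in any language. A quick Python illustration (timings will vary by machine, but the ordering holds):

```python
import re
import timeit

line = '2024-01-01T00:00:00Z level=error msg="db timeout"'

# Plain substring test vs. an equivalent regex search, 100k iterations each.
substring = timeit.timeit(lambda: "error" in line, number=100_000)
regex = timeit.timeit(lambda: re.search(r"err.*or", line), number=100_000)
print(f"substring: {substring:.4f}s  regex: {regex:.4f}s")
```

Loki's `|=` filter is the analogue of the substring test, `|~` of the regex search, so the same gap shows up in query latency at scale.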
Storage Management
Monitor disk usage:
# Check regularly
du -sh compose/monitoring/logging/loki-data
# Set up alerts when > 80% disk usage
Integration with Homarr
Grafana will automatically appear in Homarr dashboard. You can also:
Add Grafana Widget to Homarr
- Edit Homarr dashboard
- Add "iFrame" widget
- URL: https://logs.fig.systems/d/<dashboard-id>
- This embeds Grafana dashboards in Homarr
Backup and Restore
Backup
# Backup Loki data
tar czf loki-backup-$(date +%Y%m%d).tar.gz ./loki-data
# Backup Grafana dashboards and datasources
tar czf grafana-backup-$(date +%Y%m%d).tar.gz ./grafana-data ./grafana-provisioning
Restore
# Restore Loki
docker compose down
tar xzf loki-backup-YYYYMMDD.tar.gz
docker compose up -d
# Restore Grafana
docker compose down
tar xzf grafana-backup-YYYYMMDD.tar.gz
docker compose up -d
Updating
cd ~/homelab/compose/monitoring/logging
# Pull latest images
docker compose pull
# Restart with new images
docker compose up -d
Resource Usage
Typical usage:
- Loki: 200-500MB RAM
- Promtail: 50-100MB RAM
- Grafana: 100-200MB RAM
- Disk: ~1-5GB per week (depends on log volume)
Next Steps
- ✅ Deploy the stack
- ✅ Login to Grafana and explore logs
- ✅ Create useful dashboards
- ✅ Set up alerts for errors
- ✅ Configure retention based on needs
- ⬜ Add Prometheus for metrics (future)
- ⬜ Add Tempo for distributed tracing (future)
Now you can see logs from all containers in one place! 🎉