Production-ready guidelines for AIDE configuration and operation.
✅ DO: Modular service-specific excludes
/etc/aide/aide.conf.d/
├── 10-docker-excludes.conf # Docker overlay FS
├── 15-monitoring-excludes.conf # Prometheus/Grafana
├── 20-postgresql-excludes.conf # PostgreSQL WAL
└── 99-custom-excludes.conf # Site-specific
❌ DON'T: Monolithic configuration in single file
Benefits:
- Service-specific excludes can be enabled/disabled independently
- Easier to maintain and version control
- Clear naming convention (10-docker, 20-postgresql, etc.)
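As a minimal sketch of how the numeric prefixes play out, the helper below prints the drop-in files in the lexical order the prefixes are meant to enforce. `list_dropins` is a hypothetical name, not part of AIDE. How the directory is actually pulled into aide.conf depends on your AIDE version and distribution, so verify that separately.

```shell
# Hypothetical helper: print drop-in files in lexical (numeric-prefix) order.
# Usage: list_dropins /etc/aide/aide.conf.d
list_dropins() {
    dropin_dir=${1:-/etc/aide/aide.conf.d}
    for f in "$dropin_dir"/*.conf; do
        [ -e "$f" ] || continue   # empty directory: the glob did not match
        basename "$f"
    done | LC_ALL=C sort          # byte-wise sort, so 10-* precedes 99-*
}
```

Running it against the layout above should list `10-docker-excludes.conf` first and `99-custom-excludes.conf` last.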
✅ DO: Exclude legitimate changing directories
!/var/log
!/var/cache
!/tmp
!/var/lib/docker/overlay2
!/var/lib/postgresql/.*/pg_wal
❌ DON'T: Exclude entire /var or /opt
Goal: <1% false-positive rate in production
See: FALSE_POSITIVE_REDUCTION.md
✅ DO: Modern algorithms (sha256, sha512)
# In aide.conf
NORMAL = sha256+sha512
❌ DON'T: Deprecated algorithms (md5, sha1)
Trade-off: sha256+sha512 balances security and performance
- Database size: ~20-50 MB (medium filesystem)
- Scan time: 1-3 minutes (NVMe SSD)
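One way to catch leftover deprecated algorithms is to scan the configuration for them. The sketch below is an assumption about how such a check could look, not an AIDE feature; `find_weak_hashes` is a hypothetical name, and it relies on a plain text match, so review any hits by hand.

```shell
# Hypothetical check: print non-comment config lines that reference md5 or sha1.
# Exit status 0 means at least one such line was found, 1 means none.
find_weak_hashes() {
    conf=${1:-/etc/aide/aide.conf}
    # -w matches md5/sha1 as whole words; the second grep drops comment lines
    # while keeping the original line numbers from the first grep.
    grep -nwE 'md5|sha1' "$conf" | grep -vE '^[0-9]+:[[:space:]]*#'
}
```

Usage: `find_weak_hashes /etc/aide/aide.conf && echo "deprecated hash still configured"`.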
Critical files to protect:
| File | Protection | Priority |
|---|---|---|
| /usr/bin/aide | chattr +i | CRITICAL |
| /etc/aide/aide.conf | chattr +i | HIGH |
| /var/lib/aide/aide.db | chattr +i | MEDIUM |
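To see which of these files currently carry the flag, `lsattr` can be used. The sketch below is only an illustration with hypothetical function names; it is not the project's validation script, whose contents are not shown here.

```shell
# Hypothetical helper: succeed if an lsattr flag field contains the immutable bit (i).
has_immutable_flag() {
    case "$1" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Report the state of each given file. lsattr prints "<flags> <path>".
check_immutable_files() {
    for target in "$@"; do
        flags=$(lsattr -d "$target" 2>/dev/null | awk '{print $1}')
        if has_immutable_flag "$flags"; then
            echo "OK       $target"
        else
            echo "MISSING  $target"
        fi
    done
}
```

Usage: `check_immutable_files /usr/bin/aide /etc/aide/aide.conf`. A file that does not exist, or a filesystem without attribute support, is reported as `MISSING`.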
Implementation:
sudo chattr +i /usr/bin/aide
sudo chattr +i /etc/aide/aide.conf
# Optional: Database (prevents updates!)
# sudo chattr +i /var/lib/aide/aide.db
Validation:
./scripts/validate-immutable-flags.sh
✅ DO: Use dedicated _aide group
# Directory: 750 root:_aide
drwxr-x--- root _aide /var/lib/aide
# Database: 640 root:_aide
-rw-r----- root _aide /var/lib/aide/aide.db
❌ DON'T: World-readable (644) or root-only (600)
Benefits:
- Non-root monitoring tools can read database
- Write access restricted to root
- Audit logging possible via group membership
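The ownership and modes listed above can be spot-checked with `stat`. This is a minimal sketch assuming GNU `stat` (the `-c` option); `check_perms` is a hypothetical helper, not one of the project's scripts.

```shell
# Hypothetical check: compare a path's mode, owner and group with expected values.
# Returns 0 on match, 1 on mismatch, 2 if stat itself fails.
check_perms() {
    target=$1 want_mode=$2 want_owner=$3 want_group=$4
    actual=$(stat -c '%a %U %G' "$target") || return 2
    if [ "$actual" = "$want_mode $want_owner $want_group" ]; then
        echo "OK   $target ($actual)"
    else
        echo "FAIL $target (have: $actual, want: $want_mode $want_owner $want_group)"
        return 1
    fi
}
```

Usage: `check_perms /var/lib/aide 750 root _aide` and `check_perms /var/lib/aide/aide.db 640 root _aide`.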
Validation:
./scripts/validate-permissions.sh monitoring-user
Problem: Default tmpfiles.d config resets permissions at boot
✅ DO: Create override in /etc/tmpfiles.d/
# Create override (correct group and permissions)
sudo tee /etc/tmpfiles.d/aide-common.conf > /dev/null << 'EOF'
d /run/aide 0700 _aide root
d /var/log/aide 2755 _aide adm
d /var/lib/aide 0750 _aide _aide
EOF
# Apply immediately
sudo systemd-tmpfiles --create /etc/tmpfiles.d/aide-common.conf
❌ DON'T: Rely on manual chmod/chown after each boot
Why Critical:
- Default config: _aide:root 0700 (blocks monitoring access)
- Override ensures: _aide:_aide 0750 (persistent across reboots)
- Prevents "Permission denied" for monitoring users
Verification:
# Check override exists
ls -l /etc/tmpfiles.d/aide-common.conf
# Verify permissions survive reboot
sudo ls -ld /var/lib/aide/ # drwxr-x--- _aide _aide
✅ DO: Use systemd timer for daily updates
# aide-update.timer
[Timer]
# Daily at 03:00 (each OnCalendar= line adds its own trigger, so use only one)
OnCalendar=*-*-* 03:00:00
Persistent=true
❌ DON'T: Manual updates only
Why: Database must stay current with system changes
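A complete timer unit might look like the sketch below. It writes the file into a directory passed as an argument so it can be tried outside /etc first. The unit name `aide-update.timer` comes from this guide; the `Description=` text and the function name are assumptions. Because every `OnCalendar=` line adds a trigger, the sketch uses a single expression for 03:00.

```shell
# Sketch: write a daily 03:00 timer unit into the given directory.
write_aide_timer() {
    unit_dir=${1:-/etc/systemd/system}
    cat > "$unit_dir/aide-update.timer" <<'EOF'
[Unit]
Description=Daily AIDE database update

[Timer]
# One calendar expression: every day at 03:00.
OnCalendar=*-*-* 03:00:00
# Run at next boot if the machine was off at the scheduled time.
Persistent=true

[Install]
WantedBy=timers.target
EOF
}
```

The expression can be checked with `systemd-analyze calendar '*-*-* 03:00:00'`. After installing the file, run `sudo systemctl daemon-reload` and `sudo systemctl enable --now aide-update.timer`.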
✅ DO: Increase timeout for large filesystems
# For 2TB+ filesystems
[Service]
TimeoutStartSec=30min
❌ DON'T: Use default 90s timeout
Rule of thumb: 1 minute per 100GB filesystem
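The rule of thumb is simple integer arithmetic. A minimal sketch, with hypothetical helper names and assuming GNU `df` for the size lookup:

```shell
# Rule of thumb from this guide: 1 minute per 100 GB, rounded up.
estimate_timeout_min() {
    size_gb=$1
    echo $(( (size_gb + 99) / 100 ))
}

# Used space of a mount point in whole GB (GNU df; --output is not portable).
used_gb() {
    df -BG --output=used "${1:-/}" | tail -n 1 | tr -dc '0-9'
}
```

Usage: `estimate_timeout_min "$(used_gb /)"`. For a 2 TB filesystem (2048 GB) the baseline comes out at 21 minutes, so the `30min` value shown above leaves some headroom.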
✅ DO: Export metrics to Prometheus
# aide.prom
aide_check_exit_code 0
aide_database_size_bytes 22000000
aide_check_duration_seconds 83
❌ DON'T: Rely on logs only
Alert on:
- Check failures (exit code != 0)
- Missing executions (timer not running)
- Long duration (performance degradation)
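The metrics file can be produced by a small wrapper around the check. The sketch below only covers the writing step. The metric names are the ones shown above; the function name and argument order are assumptions.

```shell
# Sketch: write AIDE results in Prometheus textfile-collector format.
# Usage: write_aide_metrics <output file> <exit code> <duration s> <db size bytes>
write_aide_metrics() {
    out=$1 exit_code=$2 duration=$3 db_size=$4
    tmp="$out.$$"
    {
        echo "aide_check_exit_code $exit_code"
        echo "aide_database_size_bytes $db_size"
        echo "aide_check_duration_seconds $duration"
    } > "$tmp"
    # A rename on the same filesystem is atomic, so the collector
    # never reads a half-written file.
    mv "$tmp" "$out"
}
```

A wrapper would record `date +%s` before and after `aide --check`, capture `$?`, and read the database size with `stat -c %s /var/lib/aide/aide.db` before calling the function.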
Common excludes:
!/var/log # Logs change constantly
!/var/cache # Cache can be recreated
!/tmp # Temporary files
!/var/lib/docker/overlay2 # Docker layers
!/var/lib/postgresql/.*/pg_wal # PostgreSQL WAL
!/opt/monitoring/data # Prometheus time-series
!/home/.*/.cache # User caches
Validation: After adding excludes, check scan time reduction
✅ DO (AIDE 0.18+):
# In aide.conf
num_workers=4 # Use 4 CPU cores
Test before/after:
time sudo aide --check
✅ DO: Regular offsite backups
# Weekly backup
rsync -av /var/lib/aide/aide.db \
backup-server:/backups/aide/aide.db.$(date +%Y%m%d)
❌ DON'T: Only local backups
Why: Database is reference baseline - must survive host failure
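Since the backup is only useful if it arrives intact, one option is to stage a dated copy together with a checksum before shipping it. This is a sketch under that assumption; the guide itself does not prescribe checksums, and `stage_aide_backup` and the staging directory are hypothetical.

```shell
# Sketch: stage a dated copy of the database plus a SHA-256 checksum file.
# Prints the path of the staged copy.
stage_aide_backup() {
    db=$1 stage_dir=$2
    stamp=$(date +%Y%m%d)
    copy="$stage_dir/aide.db.$stamp"
    cp -p "$db" "$copy" || return 1
    # Record the checksum with a relative file name so it verifies on any host.
    ( cd "$stage_dir" && sha256sum "aide.db.$stamp" > "aide.db.$stamp.sha256" ) || return 1
    echo "$copy"
}
```

The staging directory can then be sent with `rsync -av <staging dir>/ backup-server:/backups/aide/`, and `sha256sum -c aide.db.<date>.sha256` on the receiving side confirms the copy.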
✅ DO: Quarterly recovery tests
# 1. Restore database from backup
# 2. Run aide --check
# 3. Verify results match expectations
❌ DON'T: Assume backups work without testing
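Before step 2, a quick sanity check of the restored file can save a confusing AIDE error. The sketch below assumes the database is in AIDE's text format (optionally gzip-compressed) and that its first line is `@@begin_db`. Confirm that against one of your own databases; `looks_like_aide_db` is a hypothetical helper.

```shell
# Sketch: basic sanity check of a restored database file.
# Assumption: text format, optionally gzip-compressed, first line "@@begin_db".
looks_like_aide_db() {
    db=$1
    [ -s "$db" ] || return 1          # missing or empty file
    if gzip -t "$db" 2>/dev/null; then
        gzip -dc "$db"
    else
        cat "$db"
    fi | head -n 1 | grep -q '^@@begin_db'
}
```

Usage: `looks_like_aide_db /var/lib/aide/aide.db && sudo aide --check`.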
✅ DO: Comment why each exclude exists
# Custom excludes - Site-specific
# Added 2025-12-15: Exclude Nextcloud data directory (high churn)
!/opt/nextcloud/data
# Added 2025-12-20: Exclude local backups (scanned separately)
!/opt/backups/local
❌ DON'T: Uncommented excludes (future you won't remember why)
✅ DO: Track configuration changes
# aide.conf header
# Version: 2.0
# Last Updated: 2025-12-15
# Changelog:
# 2025-12-15: Added PostgreSQL WAL excludes
# 2025-12-10: Enabled sha512 hashing
Recommended integrations:
- Prometheus: Metrics export (check status, duration, database size)
- Grafana: Dashboards and visualization
- Alertmanager: Alert routing
- Telegram: Real-time failure notifications
Minimal setup:
# Prometheus metrics
/var/lib/node_exporter/textfile_collector/aide.prom
# Telegram on failure
OnFailure=aide-failure-alert.service
✅ DO: Use timer, not boot-time service
[Install]
# NOT multi-user.target (systemd only treats "#" as a comment at the start of a line)
WantedBy=timers.target
❌ DON'T: Block boot sequence with AIDE check
Why: AIDE can take 5-15 minutes - don't delay boot
See: BOOT_RESILIENCY.md
✅ DO: Test on staging environment first
# 1. Deploy configuration
# 2. Run manual check
sudo aide --check
# 3. Trigger a deliberate test change
sudo touch /etc/test-file
# 4. Verify detection
sudo aide --check # Should report new file
# 5. Update database
sudo /usr/local/bin/update-aide-db.sh
# 6. Cleanup
sudo rm /etc/test-file
❌ DON'T: Deploy directly to production
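For step 4, the exit status of `aide --check` is easier to script against than the report text. Per the aide(1) manual page it is a bitmask (1 = new, 2 = removed, 4 = changed files), with values of 14 and above indicating errors. A small decoder, with a hypothetical function name:

```shell
# Decode the exit status of `aide --check`:
#   1 = new files, 2 = removed files, 4 = changed files; 14 and above are errors.
describe_aide_exit() {
    code=$1
    if [ "$code" -eq 0 ]; then echo "clean"; return 0; fi
    if [ "$code" -ge 14 ]; then echo "error"; return 0; fi
    result=""
    if [ $(( code & 1 )) -ne 0 ]; then result="$result new"; fi
    if [ $(( code & 2 )) -ne 0 ]; then result="$result removed"; fi
    if [ $(( code & 4 )) -ne 0 ]; then result="$result changed"; fi
    echo "${result# }"   # strip the leading space
}
```

Usage: `sudo aide --check; describe_aide_exit $?`. After the `touch` in step 3 the result should include `new`; it may also include `changed` if the parent directory's timestamps are monitored.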
✅ DO: Monthly validation
# Run validation scripts
./scripts/validate-permissions.sh
./scripts/validate-immutable-flags.sh
# Check timer status
systemctl list-timers aide-update.timer
# Verify last successful run
journalctl -u aide-update.service -n 10
- AIDE binary protected with immutable flag
- Configuration protected with immutable flag
- Database permissions set to 640 root:_aide
- Non-root monitoring user in _aide group
- systemd service runs with PrivateTmp=yes
- Update script validated (shellcheck)
- Excludes follow principle of least exclusion
- Modern hash algorithms (sha256/sha512)
- Regular database backups configured
- Recovery procedure tested
- Validation scripts pass
- High-churn directories excluded
- Appropriate timeout configured (30min+)
- Multi-threading enabled (if supported)
- Service runs with low priority (Nice=19)
- Scan time <5 minutes (or acceptable)
- Database size <50MB (or acceptable)
- False-positive rate <1%
Top 5 Critical Practices:
- ✅ Protect AIDE binary with immutable flag
- ✅ Use modular drop-in configuration
- ✅ Automate updates with systemd timer
- ✅ Monitor execution with Prometheus
- ✅ Regular validation with scripts
Avoid These Mistakes:
- ❌ No immutable flag protection
- ❌ World-readable database (644)
- ❌ Blocking boot with AIDE service
- ❌ No monitoring/alerting
- ❌ Manual updates only