On my way to getting rid of Nagios, I have now installed the new Dashboard. After several tries I found out that the Dashboard needs to be installed on at least a Replica Directory Node.
The first thing I noticed was that connection errors do not show up in the Grafana dashboards, only in the Prometheus interface. See Clientserver is not visble in Dashboard - #4 by riess82. I solved that, but it takes some searching around for someone new to the system.
What I don’t know yet: is an alert triggered if a registered client is unreachable? (I didn’t notice any, but I may have been too busy figuring out what to do.)
Next issue: when trying to set up an email alert, I get the following error on sending a test email:
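For reference, a minimal Python sketch of checking for unreachable targets by hand through the Prometheus HTTP API, using the instant query `up == 0`. The base URL is an assumption (a Prometheus reachable on localhost:9090) and has to be adjusted to the actual setup. It only lists targets that Prometheus itself reports as down and says nothing about whether an alert rule exists for them.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the Prometheus instance; adjust to your installation.
PROMETHEUS_URL = "http://localhost:9090"


def down_instances(response: dict) -> list:
    """Return the sorted 'instance' labels from an instant-query result for 'up == 0'."""
    if response.get("status") != "success":
        raise ValueError("query failed: %s" % response.get("error", "unknown error"))
    results = response["data"]["result"]
    return sorted(r["metric"].get("instance", "<no instance label>") for r in results)


def query_down_instances(base_url: str = PROMETHEUS_URL) -> list:
    """Ask Prometheus which scrape targets currently report up == 0."""
    query = urllib.parse.urlencode({"query": "up == 0"})
    url = f"{base_url}/api/v1/query?{query}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return down_instances(json.load(resp))
```

Calling `query_down_instances()` returns a list of `host:port` strings, one per target that Prometheus could not scrape at query time.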
Failed to send Test alert.: SMTP not configured, check your grafana.ini config file’s [smtp] section
Will /var/lib/univention-appcenter/apps/admin-dashboard/conf/grafana/grafana.ini be overwritten by each app update?
I see email reporting as one of the most basic features of the dashboard. Could this be configured during installation already? (I don’t think Nagios needed SMTP configuration, so why does the dashboard need any?)
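The error message above points at the `[smtp]` section of grafana.ini. As a rough sketch, the following Python function checks whether that section exists and has `enabled = true`. The function name `smtp_status` is hypothetical, and the sketch only inspects the file content. It does not answer whether the file survives an app update.

```python
import configparser


def smtp_status(ini_text: str) -> dict:
    """Report whether the [smtp] section of a grafana.ini is present and enabled."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # grafana.ini may contain keys before the first section header;
    # a dummy section keeps configparser from rejecting them.
    parser.read_string("[__top__]\n" + ini_text)
    if not parser.has_section("smtp"):
        return {"configured": False, "reason": "no [smtp] section"}
    smtp = parser["smtp"]
    # Lines starting with ';' are comments, so a commented-out
    # 'enabled' key counts as not set.
    if not smtp.getboolean("enabled", fallback=False):
        return {"configured": False, "reason": "[smtp] enabled is not true"}
    return {
        "configured": True,
        "host": smtp.get("host", fallback=""),
        "from_address": smtp.get("from_address", fallback=""),
    }
```

It could be run against the file from this post with `smtp_status(open("/var/lib/univention-appcenter/apps/admin-dashboard/conf/grafana/grafana.ini").read())`.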
Another point:
In the Alert Dashboard I see several alerts firing with “> 0 instances”. I can expand the instances, but I only see an empty list showing the column headings “State”, “Labels”, “Created”. Shouldn’t there be some kind of link to see which servers are affected? (When I open the Prometheus GUI, I can see which servers are firing.)
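Until the dashboard shows the affected servers, the same information can be read from the Prometheus HTTP API endpoint `/api/v1/alerts`. The Python sketch below groups firing alerts by name from an already fetched JSON response. The function name `firing_instances` is hypothetical, and it assumes the alerts carry an `instance` label, which depends on the alert rules in use.

```python
from collections import defaultdict


def firing_instances(response: dict) -> dict:
    """Map each firing alert name to the sorted 'instance' labels that trigger it."""
    if response.get("status") != "success":
        raise ValueError("request failed: %s" % response.get("error", "unknown error"))
    by_alert = defaultdict(set)
    for alert in response["data"]["alerts"]:
        # Skip alerts that are still pending; only firing ones are of interest here.
        if alert.get("state") != "firing":
            continue
        labels = alert.get("labels", {})
        name = labels.get("alertname", "<unnamed>")
        by_alert[name].add(labels.get("instance", "<no instance label>"))
    return {name: sorted(instances) for name, instances in by_alert.items()}
```

The input is the decoded JSON body of a GET request to `/api/v1/alerts` on the Prometheus server.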
Last point for now:
I have tried to integrate a Windows server into the dashboard and installed the Windows node exporter. The page loads fine, so I added the target in /var/lib/univention-appcenter/apps/prometheus/conf/custom-targets.json
(Info found here: https://help.univention.com/t/best-practice-monitor-additional-services-with-the-ucs-dashboard/16967)
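To illustrate the step above, here is a small Python sketch that appends a target to such a file. It assumes that custom-targets.json follows the Prometheus file-based service discovery format (a JSON list of objects with `targets` and `labels`); this should be checked against the linked article. The helper name `add_target`, the host name and the port in the usage note are hypothetical.

```python
import json
from pathlib import Path


def add_target(path: str, target: str, labels: dict = None) -> list:
    """Append a host:port target to a file_sd-style JSON file unless it is already listed."""
    file = Path(path)
    text = file.read_text() if file.exists() else ""
    groups = json.loads(text) if text.strip() else []
    already_listed = any(target in group.get("targets", []) for group in groups)
    if not already_listed:
        groups.append({"targets": [target], "labels": labels or {}})
        file.write_text(json.dumps(groups, indent=2) + "\n")
    return groups
```

A call might look like `add_target("/var/lib/univention-appcenter/apps/prometheus/conf/custom-targets.json", "winserver.example.org:9182")`. A target added this way is still scraped with the scheme of the job that reads the file, so this alone does not change HTTP versus HTTPS.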
In the Prometheus GUI, I see the connection error “http: server gave HTTP response to HTTPS client”. The HTTPS setting comes from the job definition in prometheus.yml.
This file looks system-generated, so it will probably be overwritten at each update, which leaves no real chance of adding another job there.
Prometheus only supports one configuration file (see https://groups.google.com/g/prometheus-users/c/jzjOmxNw30Y), so there is no way of adding more config via another file.
A suggestion for a proper solution would be great. For now I will work around the problem by setting up a reverse proxy.
Thank you in advance for suggestions / help.