I'm still trying to work out how the Dashboard works behind the scenes, and in particular why it shows several alarms in a working environment while Nagios reports that all is well.
The alert with the most servers affected:
occurring on all primary, backup and replica servers.
I have traced the alert back to the corresponding .prom file, /var/lib/prometheus/node-exporter/check_univention_kpasswdd.prom
However, I have not yet found out how those .prom files are written, or which script or daemon actually writes this one.
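As a starting point for digging, here is a minimal sketch of how one could check when that file was last written and which samples it currently holds. It assumes the file uses the plain Prometheus text exposition format without trailing timestamps; the helper names `parse_prom_text` and `describe_prom_file` are made up for illustration and are not part of UCS or Prometheus.

```python
import os
import time


def parse_prom_text(text):
    """Return (metric_with_labels, value) pairs from exposition-format text.

    Assumption: each sample line ends with the value and carries no timestamp.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces survive.
        name, _, value = line.rpartition(" ")
        samples.append((name, float(value)))
    return samples


def describe_prom_file(path):
    """Return the file's last-modified time (local) and its parsed samples."""
    mtime = os.path.getmtime(path)
    with open(path, encoding="utf-8") as fh:
        samples = parse_prom_text(fh.read())
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
    return stamp, samples


if __name__ == "__main__":
    path = "/var/lib/prometheus/node-exporter/check_univention_kpasswdd.prom"
    if os.path.exists(path):
        stamp, samples = describe_prom_file(path)
        print("last written:", stamp)
        for name, value in samples:
            print(name, "=", value)
    else:
        print("file not found:", path)
```

Running it a few times and watching whether the "last written" time changes should at least show if the file is refreshed periodically, which would point towards a cron job or timer as the writer.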
It would be great if someone could tell me a) whether the alert is valid and something really is wrong, and b) how, when and where those .prom files are written, so that I can dig deeper into the problem.
Thank you all in advance.