Univention-monitoring-client throwing exceptions after upgrade to 5.0-2

Today, after upgrading our UCS installation to 5.0-2, we started getting alerts from cron every 5 minutes.
One issue was solved manually by running mkdir -p /var/lib/prometheus/node-exporter.

However we still get stacktraces from check_univention_s4_connector:

Traceback (most recent call last):
  File "/usr/share/univention-monitoring-client/scripts//check_univention_s4_connector", line 75, in <module>
    S4Connector.main()
  File "/usr/lib/python3/dist-packages/univention/monitoring/__init__.py", line 74, in main
    self.write_metrics()
  File "/usr/share/univention-monitoring-client/scripts//check_univention_s4_connector", line 71, in write_metrics
    self.debug('Found %d reject(s)! Please check output of univention-s4connector-list-rejected.' % (rejects,))
AttributeError: 'S4Connector' object has no attribute 'debug'
run-parts: /usr/share/univention-monitoring-client/scripts//check_univention_s4_connector exited with return code 1
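For what it's worth, the first traceback looks like a simple attribute mix-up: judging from the other scripts, the Alert base class exposes its logger as self.log, so the call presumably should have been self.log.debug(...) rather than self.debug(...). A minimal sketch of the pattern (class names are illustrative stand-ins, not the real univention.monitoring code):

```python
import logging


class Alert(object):
    """Illustrative stand-in for univention.monitoring.Alert."""

    def __init__(self):
        # The base class only provides a logger attribute, not a
        # 'debug' method directly on the instance.
        self.log = logging.getLogger(self.__class__.__name__)


class S4Connector(Alert):
    def write_metrics(self, rejects=2):
        # Buggy: self.debug(...) raises AttributeError, as in the traceback.
        # Fixed: go through the logger attribute instead.
        self.log.debug('Found %d reject(s)!' % (rejects,))


S4Connector().write_metrics()  # runs without AttributeError
```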
Traceback (most recent call last):
  File "/usr/share/univention-monitoring-client/scripts//check_univention_samba_drs_failures", line 86, in <module>
    CheckSambaDrsRepl.main()
  File "/usr/lib/python3/dist-packages/univention/monitoring/__init__.py", line 74, in main
    self.write_metrics()
  File "/usr/share/univention-monitoring-client/scripts//check_univention_samba_drs_failures", line 65, in write_metrics
    (info_type, info) = drsuapi.DsReplicaGetInfo(self.drsuapi_handle, 1, req1)
TypeError: cannot unpack non-iterable drsuapi.DsReplicaGetInfo object
run-parts: /usr/share/univention-monitoring-client/scripts//check_univention_samba_drs_failures exited with return code 1
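The second traceback is a different kind of bug: as far as I can tell, DsReplicaGetInfo is both the name of a request/response class in samba.dcerpc.drsuapi and of a method on an established drsuapi connection. Calling the class just constructs an object, which cannot be unpacked into a tuple; only the connection method returns the (info_type, info) pair. A generic reproduction of the error, without samba:

```python
class DsReplicaGetInfo:
    """Illustrative stand-in for the samba.dcerpc.drsuapi class."""


try:
    # Buggy pattern: calling the class instead of the connection method
    # yields a plain object, and tuple unpacking then fails.
    (info_type, info) = DsReplicaGetInfo()
except TypeError as exc:
    print(exc)  # cannot unpack non-iterable DsReplicaGetInfo object
```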

Since I’m not quite sure whether this is an issue resulting from a particular (mis)configuration on our side or a general issue, I’m opening this thread (although the error message could at least be a bit more helpful in finding the misconfiguration).

Glad to provide further information.

Thanks!


I observed the very same behaviour. Any ideas?

In order to stop the annoying mails I commented out part of the file “/usr/share/univention-monitoring-client/scripts/check_univention_samba_drs_failures”, starting at line 65:

                        drsuapi_connect(self)
                        req1 = drsuapi.DsReplicaGetInfoRequest1()
#                       req1.info_type = drsuapi.DRSUAPI_DS_REPLICA_INFO_REPSTO
#                       (info_type, info) = drsuapi.DsReplicaGetInfo(self.drsuapi_handle, 1, req1)
#                       for n in info.array:
#                               if n.consecutive_sync_failures > 0:
#                                       (site, server) = drs_parse_ntds_dn(n.source_dsa_obj_dn)
#                                       consecutive_sync_failures.setdefault(server, 0)
#                                       consecutive_sync_failures[server] += n.consecutive_sync_failures
                except (CommandError, RuntimeError) as exc:
                        self.write_metric('univention_samba_drs_failures', -1)
                        self.log.debug(str(exc))
                        return

This is probably not the desired solution. Any thoughts?
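A slightly less invasive workaround than commenting the check out might be to widen the except clause so the TypeError is caught like the other expected failures and reported through the existing -1 metric. A runnable sketch of the idea, with illustrative stand-ins instead of the real samba bindings and Alert base class:

```python
# Illustrative stand-ins; the real script talks to samba via its drsuapi
# bindings and reports through the Alert base class.
class CommandError(Exception):
    pass


def drs_query():
    # Simulate the crash seen in the traceback above.
    raise TypeError('cannot unpack non-iterable drsuapi.DsReplicaGetInfo object')


metrics = {}
try:
    drs_query()
except (CommandError, RuntimeError, TypeError):
    # Widening the except clause turns the crash into the normal
    # "check failed" value, so run-parts no longer mails a traceback.
    metrics['univention_samba_drs_failures'] = -1

print(metrics)  # {'univention_samba_drs_failures': -1}
```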

Thank you for reporting the problems. A fix was released yesterday: Security and bugfix errata for Univention Corporate Server


@Best after the update I’m still getting these errors:

Traceback (most recent call last):
  File "/usr/share/univention-monitoring-client/scripts//check_univention_nfsstatus", line 93, in <module>
    FSMountCheck.main()
  File "/usr/lib/python3/dist-packages/univention/monitoring/__init__.py", line 74, in main
    self.write_metrics()
  File "/usr/share/univention-monitoring-client/scripts//check_univention_nfsstatus", line 86, in write_metrics
    msg = '%s OK, %s %s - %s' % (mounted, umounted, self.errorstate, msg)
AttributeError: 'FSMountCheck' object has no attribute 'errorstate'
run-parts: /usr/share/univention-monitoring-client/scripts//check_univention_nfsstatus exited with return code 1
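This one again looks like an attribute that is only assigned on some code paths: write_metrics reads self.errorstate, but nothing has set it in this run. A defensive pattern (illustrative names and values, not the real check_univention_nfsstatus code) would be to fall back to a default via getattr:

```python
class FSMountCheck:
    """Illustrative stand-in for the nfsstatus check."""

    def write_metrics(self):
        mounted, umounted, msg = 3, 0, 'all NFS mounts reachable'
        # 'errorstate' may never have been assigned, so guard the access
        # instead of letting the format string raise AttributeError.
        errorstate = getattr(self, 'errorstate', 'OK')
        return '%s OK, %s %s - %s' % (mounted, umounted, errorstate, msg)


print(FSMountCheck().write_metrics())  # 3 OK, 0 OK - all NFS mounts reachable
```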

And on my member/managed node I still get 5min-emails:

Traceback (most recent call last):
  File "/usr/share/univention-monitoring-client/scripts//check_univention_ldap", line 54, in <module>
    LDAP.main()
  File "/usr/lib/python3/dist-packages/univention/monitoring/__init__.py", line 74, in main
    self.write_metrics()
  File "/usr/share/univention-monitoring-client/scripts//check_univention_ldap", line 40, in write_metrics
    slapd_port = ucr['slapd/port'].split(',')[0]
AttributeError: 'NoneType' object has no attribute 'split'
run-parts: /usr/share/univention-monitoring-client/scripts//check_univention_ldap exited with return code 1
mdb_env_open failed, error 2 No such file or directory
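The LDAP check crashes here because ucr['slapd/port'] returns None on a node that does not run slapd, and None has no .split. A guard with a fallback avoids the crash (7389 as the default LDAP port and the dict stand-in for ucr are my assumptions; the official fix may do something entirely different, e.g. skip the check on member nodes):

```python
# Illustrative: ucr behaves like a mapping that returns None for unset keys.
ucr = {}

# Buggy: ucr['slapd/port'].split(',') raises AttributeError when unset.
# Guarded version with a fallback (7389 is an assumed default, not verified):
slapd_port = (ucr.get('slapd/port') or '7389').split(',')[0]
print(slapd_port)  # 7389
```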

Thanks, we will fix this via this patch: https://github.com/univention/univention-corporate-server/commit/9b4caa0c51c4bf84fbac6daa0e0ce8ea1859794b.patch

We will investigate this. I guess the check makes no sense on a member server. Probably the fix is just: https://github.com/univention/univention-corporate-server/commit/015ecbd99cc184e47b55e4a53d4bce4a02d353c4.patch


The errors were fixed last week via Security and bugfix errata for Univention Corporate Server, and today we also released an important update (Security and bugfix errata for Univention Corporate Server) which requires re-executing the join scripts of the monitoring packages.

Sorry for the inconvenience, and thanks for reporting the errors.

@Best I have just upgraded to 5.0-2 and am seeing the same behavior as mschlee:

(3221356597, 'The operation cannot be performed.')
Traceback (most recent call last):
  File "/usr/share/univention-monitoring-client/scripts//check_univention_samba_drs_failures", line 78, in write_metrics
    consecutive_sync_failures = _CheckSambaDrsRepl().check()
  File "/usr/share/univention-monitoring-client/scripts//check_univention_samba_drs_failures", line 59, in check
    (info_type, info) = self.drsuapi.DsReplicaGetInfo(self.drsuapi_handle, 1, req1)
samba.NTSTATUSError: (3221356597, 'The operation cannot be performed.')

These emails come every five minutes.

The temporary fix I used was slightly different, as it looks like the update changed check_univention_samba_drs_failures.

I commented out the try/except block beginning on line 77:

class CheckSambaDrsRepl(Alert):

        def write_metrics(self):
                # return OK, if samba autostart is false
                if not ucr.is_true('samba4/autostart', False):
                        self.write_metric('univention_samba_drs_failures', 0)
                        self.log.debug('samba4/autostart is not true')
                        return

                try:
                        consecutive_sync_failures = _CheckSambaDrsRepl().check()
                except (CommandError, RuntimeError) as exc:
                        self.write_metric('univention_samba_drs_failures', -1)
                        # self.log.exception(str(exc))
                        return

                msg = None
                for server, failures in consecutive_sync_failures.items():
                        text = '%s failures on %s' % (failures, server)
                        msg = msg + ', ' + text if msg else text

                self.write_metric('univention_samba_drs_failures', sum(consecutive_sync_failures.values()))
                self.log.debug(msg or 'no drs failures')

As mschlee said,

This is probably not the desired solution. Any thoughts?

Your solution is actually correct: if we ignore that error, we should not log it to stderr, otherwise cron will send regular emails.
Why this exception is happening in your environment is unclear to me; that should not be the case.
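The stderr point matters because cron mails anything a job writes to stdout/stderr. A sketch of the distinction using Python's stdlib logging (illustrative, not the actual univention.monitoring logger setup):

```python
import logging

log = logging.getLogger('drs_check')
# log.exception() / a StreamHandler on stderr would reach cron's mail;
# a NullHandler (or a file handler) keeps deliberately ignored errors silent.
log.addHandler(logging.NullHandler())

try:
    raise RuntimeError('The operation cannot be performed.')
except RuntimeError as exc:
    log.debug('ignoring DRS check failure: %s', exc)  # nothing hits stderr

print('check finished quietly')
```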