Problem
The Primary Node was upgraded to UCS 5.2-5 successfully.
After the upgrade, slapd.service failed to start on the Backup Node and several UCS@school Replica Nodes because UVMM-related ACL and schema references were still present in the generated slapd.conf. LDAP startup failed with the following error:
unknown attr "@univentionVirtualMachine"
As a consequence, numerous failed.ldif entries were also generated on the Backup Node and the (UCS@school) Replica Nodes.
The affected Backup and Replica Nodes were powered off during the upgrade process. After they were started again, slapd.service no longer started on these systems, which were still running UCS 5.0-10.
On the DC Backup Node, the UVMM schema was still installed. UCS@school deploys an LDAP ACL that optionally references the UVMM schema. Because the corresponding replication transactions were missed while the systems were offline, outdated UVMM-related ACL templates remained on the affected systems.
Several failed.ldif entries were present on the Backup Node. All entries had been created while the affected systems were offline.
Example log output:
Jun 06 13:31:05 replica-node slapd[3367]: connections_destroy: nothing to destroy.
Jun 06 13:31:05 replica-node slapd[3355]: Starting ldap server(s): slapd ...failed.
Jun 06 13:31:05 replica-node slapschema[3370]: Loaded metadata from "/usr/share/univention-management-console/saml/idp/ucs-sso.example.net.xml"
Jun 06 13:31:05 replica-node slapschema[3370]: No trusted audiences configured
Jun 06 13:31:05 replica-node slapschema[3370]: oauthbearer_client_plug_init() failed in sasl_server_add_plugin(): error when parsing configuration file
Jun 06 13:31:05 replica-node slapschema[3370]: _sasl_plugin_load failed on sasl_server_plug_init for plugin: oauthbearer
Jun 06 13:31:05 replica-node slapd[3355]: /etc/ldap/slapd.conf: line 335: unknown attr "@univentionVirtualMachine" in to clause <access clause>
Jun 06 13:31:05 replica-node systemd[1]: slapd.service: Control process exited, code=exited, status=1/FAILURE
Jun 06 13:31:05 replica-node systemd[1]: slapd.service: Failed with result 'exit-code'.
Jun 06 13:31:05 replica-node systemd[1]: Failed to start LSB: OpenLDAP standalone server (Lightweight Directory Access Protocol).
The affected systems also contained a large number of replication failures:
wc -l /var/lib/univention-directory-replication/failed.ldif
1187 /var/lib/univention-directory-replication/failed.ldif
Root Cause
Old UVMM packages and the LDAP ACL templates installed by these packages remained on the affected nodes.
When UVMM packages and templates are removed from the Primary Node, the corresponding changes are distributed through the Listener/Notifier replication mechanism. Nodes that are powered off during this time do not receive these replication transactions.
These missing transactions can lead to replication inconsistencies and gaps that become visible in the output of:
univention-directory-listener-ctrl status
Example:
Current Notifier ID on "primary.example.net"
3873646
Last Notifier ID processed by local Listener:
3805937
As a result, outdated UVMM ACL templates remain on the affected nodes. During LDAP startup, these ACLs reference the removed UVMM schema attributes and cause slapd.service to fail with the error:
unknown attr "@univentionVirtualMachine"
This issue is tracked in the following bug report:
Bug Report: #59502
Investigation
Check slapd.conf for UVMM References
grep -inE 'uvmm|univentionVirtualMachine' /etc/ldap/slapd.conf
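The slapd error message itself names the file and line of the failing ACL, which can be quicker than searching by keyword. The following is a minimal sketch, assuming the log line layout shown in the example log output; slapd_unknown_attr is a hypothetical helper name and not part of UCS.

```shell
# Hypothetical helper: read log lines on stdin and print
# "<config file> <line number> <attribute>" for each "unknown attr" error.
# Assumes the message layout from the example log output in this article.
slapd_unknown_attr() {
    sed -nE 's/.*slapd\[[0-9]+\]: ([^:]+): line ([0-9]+): unknown attr "([^"]+)".*/\1 \2 \3/p'
}
```

One possible use is journalctl -u slapd.service | slapd_unknown_attr. With the reported file and line number, sed -n '335p' /etc/ldap/slapd.conf (for the example log) shows the offending access clause.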
Check the Replication Status
univention-directory-listener-ctrl status
or
/usr/lib/nagios/plugins/check_univention_replication
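The size of the replication gap can be computed from the status output. The following is a minimal sketch, assuming the output layout shown in the Root Cause example (each ID alone on the line after its heading); notifier_gap is a hypothetical helper name.

```shell
# Hypothetical helper: read "univention-directory-listener-ctrl status"
# output on stdin and print how many transactions the local Listener is
# behind. Assumes each ID stands alone on a line after its heading.
notifier_gap() {
    awk '
        /Current Notifier ID/ { want = "cur";  next }
        /Last Notifier ID/    { want = "last"; next }
        want != "" && NF == 1 && $1 ~ /^[0-9]+$/ { ids[want] = $1; want = "" }
        END {
            # Fail if either ID could not be found in the input.
            if (!("cur" in ids) || !("last" in ids)) exit 1
            print ids["cur"] - ids["last"]
        }
    '
}
```

Used as univention-directory-listener-ctrl status | notifier_gap, the example values from the Root Cause section give a gap of 67709 transactions. A result of 0 means the Listener has caught up.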
Verify LDAP Database Consistency
Run:
slapschema
If no errors are reported, restart the services:
systemctl restart slapd.service
systemctl restart univention-directory-notifier.service
systemctl restart univention-directory-listener.service
Check for Existing failed.ldif Files
Inspect existing replication failures. Depending on the contents, they may block the startup of slapd.service.
wc -l /var/lib/univention-directory-replication/failed.ldif
less /var/lib/univention-directory-replication/failed.ldif
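Note that wc -l counts lines, not directory entries. The following is a minimal sketch for a quicker overview, assuming failed.ldif contains standard LDIF change records; both function names are hypothetical.

```shell
# Hypothetical helper: count LDIF records in a file by their "dn:" lines
# (this also matches base64-encoded "dn::" lines).
ldif_entry_count() {
    grep -c '^dn:' "$1"
}

# Hypothetical helper: tally the changetype values found in the file,
# e.g. how many records are "modify" and how many are "delete".
ldif_changetypes() {
    grep -i '^changetype:' "$1" | awk '{ print tolower($2) }' | sort | uniq -c
}
```

For example, ldif_entry_count /var/lib/univention-directory-replication/failed.ldif prints the number of affected entries, and ldif_changetypes on the same file shows which kinds of operations failed.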
Check UVMM Package Status
dpkg -l | grep -i uvmm
Search for Remaining UVMM Templates
grep -rl "uvmm" /etc/univention/templates/info/ /etc/univention/templates/files/
Check for Existing or Hanging slapd Processes
ps aux | grep slapd
Terminate any stale processes if necessary:
kill <PID>
Solution
Remove all remaining UVMM packages from the affected nodes.
Important: The LDAP ACL template files are not automatically removed when the package is uninstalled. These files must be removed manually.
Remove the remaining UVMM packages:
univention-remove 'univention-management-console-module-uvmm*'
Remove the obsolete template files:
rm /etc/univention/templates/files/etc/ldap/slapd.conf.d/66univention-ldap-server_acl-master-uvmm.acl
rm /etc/univention/templates/info/ldapacl_66univention-ldap-server_acl-master-uvmm.acl.info
Regenerate the LDAP configuration:
ucr commit /etc/ldap/slapd.conf
Restart the LDAP service:
systemctl restart slapd.service
Verify the service status:
systemctl status slapd.service
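The regenerate-and-restart steps above can be guarded by checking that the regenerated configuration no longer references UVMM. The following is a minimal sketch; has_uvmm_refs is a hypothetical helper name, and the search pattern is an assumption based on the error message in this article.

```shell
# Hypothetical helper: exit 0 and print the matching lines (with line
# numbers) if the given file still contains UVMM references; exit 1 if
# the file is clean.
has_uvmm_refs() {
    grep -inE 'uvmm|univentionVirtualMachine' "$1"
}
```

One possible use is ucr commit /etc/ldap/slapd.conf && ! has_uvmm_refs /etc/ldap/slapd.conf && systemctl restart slapd.service, which only restarts slapd once no match remains. A match inside a comment would also block the restart, so review any printed lines before deciding.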
If the restart of slapd.service still fails, an existing failed.ldif may be blocking the startup process. In this case, inspect and remove the problematic entries as described in the following article:
Final Verification
After LDAP has been restored successfully, verify the replication status again:
univention-directory-listener-ctrl status
or
/usr/lib/nagios/plugins/check_univention_replication
If replication is still not synchronized, restarting the Listener service may help:
systemctl restart univention-directory-listener.service
The replication status should eventually show that the local Listener has processed the current Notifier ID and that all nodes are synchronized again.
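Because catching up can take a while after a large gap, the check can be repeated until it succeeds. The following is a minimal sketch; retry_until is a hypothetical helper name, and it assumes the check command follows the usual Nagios plugin convention of exiting with status 0 when the state is OK.

```shell
# Hypothetical helper: run a command up to <tries> times, sleeping <delay>
# seconds between attempts, and stop as soon as it exits with status 0.
# Usage: retry_until <tries> <delay> <command> [args...]
retry_until() {
    tries=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Do not sleep after the final attempt.
        [ "$i" -lt "$tries" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, retry_until 30 60 /usr/lib/nagios/plugins/check_univention_replication would poll once a minute for up to 30 attempts and return a non-zero status if replication has not caught up by then.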