Due to a catastrophic hardware failure, the virtualization host for my primary UCS server died and cannot be restored in a reasonable time. So, I decided to convert my backup UCS server (on a different virtualization host) to become the new primary. I’ll add a new backup UCS server later.
I am using UCS 5.0-1 with the latest updates.
I followed the current documentation for this procedure. Despite the frequent I/O errors and reboots on the original primary server during the process, I followed and double-checked every step, and I kept the artifacts (base.conf, base-forced.conf, dpkg.selection, and the ldap-schema files) on my independent workstation.
Several things broke after I permanently shut down the original primary UCS host and ran the backup2master script. I’m happy to share the details (I’m keeping notes), but I’ve resolved everything I can so far, so I’d like to focus on this last(?) issue.
Right now, I cannot join any computers to the new primary UCS server. When logging into the UCS web console, I’m told (repeatedly) that a “Domain Join” module script needs to be run. It fails every time I try to run it. I eventually found this error in the Domain Join log via the UCS web console:
RUNNING 98univention-samba4-saml-kerberos.inst
2022-06-14 16:21:34.487832395-05:00 (in joinscript_init)
could not obtain current kerberos secret for sso user
__JOINERR__:FAILED: /usr/lib/univention-install/98univention-samba4-saml-kerberos.inst
EXITCODE=0
Strangely, EXITCODE=0 looks more like a success than a failure, yet the script keeps coming back as Pending.
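To make the failure marker easier to spot than the misleading exit code, here is a minimal sketch (plain Python, not part of UCS) that scans join log text for `__JOINERR__:FAILED` lines and lists the affected scripts. The function name and the sample text are only for illustration, and the idea that the Pending state follows this marker rather than EXITCODE is my assumption from the excerpt above.

```python
import re

# Marker copied from the log excerpt above. That the "Pending" state is
# driven by this line rather than by EXITCODE is an assumption.
FAILED_RE = re.compile(r"^__JOINERR__:FAILED:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def failed_join_scripts(log_text):
    """Return the join script paths marked as failed, in order, without duplicates."""
    seen = []
    for path in FAILED_RE.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen


# Sample text taken from the Domain Join log quoted in this post.
SAMPLE = """RUNNING 98univention-samba4-saml-kerberos.inst
2022-06-14 16:21:34.487832395-05:00 (in joinscript_init)
could not obtain current kerberos secret for sso user
__JOINERR__:FAILED: /usr/lib/univention-install/98univention-samba4-saml-kerberos.inst
EXITCODE=0
"""

if __name__ == "__main__":
    for script in failed_join_scripts(SAMPLE):
        print(script)
```

Run against the full join log, this should show at a glance whether 98univention-samba4-saml-kerberos.inst is the only script still failing.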
Regarding the sso user, it is one of several objects that failed during the backup2master process. Strangely, all of the affected objects existed in both Samba and LDAP. I confirmed every one of them by hand, yet they were throwing errors in /var/log/univention/connector-s4.log. Per the documentation, I manually verified them and then removed them all from the SQLite database. However, the sso user keeps coming back as failing to sync (usually after I try to re-run 98univention-samba4-saml-kerberos.inst). The traceback for the sso user is:
14.06.2022 15:39:05.266 LDAP (PROCESS): sync AD > UCS: Resync rejected dn: 'CN=ucs-sso,CN=Users,DC=redacted,DC=tld'
14.06.2022 15:39:05.275 LDAP (PROCESS): sync AD > UCS: [ user] [ modify] 'uid=ucs-sso,cn=users,dc=redacted,dc=tld'
14.06.2022 15:39:05.289 LDAP (ERROR ): Unknown Exception during sync_to_ucs
14.06.2022 15:39:05.289 LDAP (ERROR ): Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/univention/admin/handlers/users/user.py", line 2299, in __allocate_rid
    return self.request_lock('sid', sid)
  File "/usr/lib/python3/dist-packages/univention/admin/handlers/__init__.py", line 1693, in request_lock
    value = univention.admin.allocators.request(self.lo, self.position, name, value)
  File "/usr/lib/python3/dist-packages/univention/admin/allocators.py", line 209, in request
    return acquireUnique(lo, position, type, value, _type2attr[type], scope=_type2scope[type])
  File "/usr/lib/python3/dist-packages/univention/admin/allocators.py", line 198, in acquireUnique
    univention.admin.locking.lock(lo, position, type, value.encode('utf-8'), scope=scope)
  File "/usr/lib/python3/dist-packages/univention/admin/locking.py", line 121, in lock
    raise univention.admin.uexceptions.noLock(_('The attribute %r could not get locked.') % (type,))
univention.admin.uexceptions.noLock: The attribute 'sid' could not get locked.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/univention/s4connector/__init__.py", line 1498, in sync_to_ucs
    result = self.modify_in_ucs(property_type, object, module, position)
  File "/usr/lib/python3/dist-packages/univention/s4connector/__init__.py", line 1223, in modify_in_ucs
    res = ucs_object.modify(serverctrls=serverctrls, response=response)
  File "/usr/lib/python3/dist-packages/univention/admin/handlers/users/user.py", line 1503, in modify
    return super(object, self).modify(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/univention/admin/handlers/__init__.py", line 638, in modify
    dn = self._modify(modify_childs, ignore_license=ignore_license, response=response)
  File "/usr/lib/python3/dist-packages/univention/admin/handlers/__init__.py", line 1342, in _modify
    ml = self._ldap_modlist()
  File "/usr/lib/python3/dist-packages/univention/admin/handlers/users/user.py", line 1796, in _ldap_modlist
    ml = self._modlist_samba_sid(ml)
  File "/usr/lib/python3/dist-packages/univention/admin/handlers/users/user.py", line 2160, in _modlist_samba_sid
    sid = self.__generate_user_sid(self['uidNumber'])
  File "/usr/lib/python3/dist-packages/univention/admin/handlers/users/user.py", line 2305, in __generate_user_sid
    return self.__allocate_rid(self['sambaRID'])
  File "/usr/lib/python3/dist-packages/univention/admin/handlers/users/user.py", line 2301, in __allocate_rid
    raise univention.admin.uexceptions.sidAlreadyUsed(rid)
univention.admin.uexceptions.sidAlreadyUsed: 1104
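
For triage, the two facts that seem to matter in this log are which DN was rejected and which RID raised sidAlreadyUsed. Below is a minimal sketch (plain Python, not part of UCS) that pulls both out of connector log text and builds an LDAP substring filter on sambaSID, which could be passed to a search tool such as univention-ldapsearch to see which entry already carries that RID. The helper names are made up for illustration, and the idea that a second entry or a stale lock is holding the SID is only my assumption from the exception name.

```python
import re

# Patterns copied from the connector-s4.log lines quoted above.
REJECTED_DN_RE = re.compile(r"Resync rejected dn: '([^']+)'")
SID_USED_RE = re.compile(r"sidAlreadyUsed: (\d+)")


def summarize_connector_log(log_text):
    """Return (rejected DNs, conflicting RIDs) found in the log text, sorted and unique."""
    dns = sorted(set(REJECTED_DN_RE.findall(log_text)))
    rids = sorted({int(rid) for rid in SID_USED_RE.findall(log_text)})
    return dns, rids


def sid_suffix_filter(rid):
    """Build an LDAP filter matching any sambaSID that ends in the given RID."""
    # Assumption: the directory allows substring matches on sambaSID.
    return "(sambaSID=*-%d)" % rid


# Sample text reduced to the relevant lines of the log in this post.
SAMPLE_LOG = """14.06.2022 15:39:05.266 LDAP (PROCESS): sync AD > UCS: Resync rejected dn: 'CN=ucs-sso,CN=Users,DC=redacted,DC=tld'
14.06.2022 15:39:05.289 LDAP (ERROR ): Unknown Exception during sync_to_ucs
univention.admin.uexceptions.sidAlreadyUsed: 1104
"""

if __name__ == "__main__":
    rejected, rids = summarize_connector_log(SAMPLE_LOG)
    print("rejected:", rejected)
    for rid in rids:
        print("search filter:", sid_suffix_filter(rid))
```

If a search with that filter returns an entry other than uid=ucs-sso, that would point to a genuine SID collision; if it returns only uid=ucs-sso, a leftover lock would be my next guess.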
I have a bunch of computers that need to be added back to the new UCS primary server. They were all on the original primary, but all of them were disconnected after backup2master was run. Strangely, LDAP and Samba both list them as present, but they are not actually attached.
Thanks much for any and all guidance.