Problem: Join Script 91univention-saml Fails

Problem

(Re-)joining a UCS server to a domain fails in the 91univention-saml.inst join script.

Environment

1. DNS Backend

The DNS backend on the affected server is set to ldap:

root@slave:~# ucr get dns/backend
ldap

2. join.log

You will notice the following entries in /var/log/univention/join.log:

2020-03-04 20:49:37,809 INFO    __main__.ucr      Reloading BIND
File: /etc/bind/named.conf.proxy
File: /etc/bind/named.conf.samba4
File: /etc/resolv.conf
authentication error: {'desc': "Can't contact LDAP server"}
authentication error: {'desc': "Can't contact LDAP server"}
authentication error: {'desc': "Can't contact LDAP server"}
authentication error: {'desc': "Can't contact LDAP server"}
2020-03-04 20:50:29.611551933+01:00 (in joinscript_save_current_version)
Configure 91univention-saml.inst Wed Mar  4 20:50:29 CET 2020
2020-03-04 20:50:29.937829500+01:00 (in joinscript_init)
Not updating saml/idp/certificate/privatekey
Not updating saml/idp/certificate/certificate
Not updating saml/idp/entityID
Not updating ucs/server/sso/fqdn
ssh: Could not resolve hostname master.multi.ucs: Name or service not known

__JOINERR__:FAILED: /usr/lib/univention-install/91univention-saml.inst

3. syslog

Additionally, you will see entries like these in /var/log/syslog:

Mar  5 11:59:12 slave named[42283]: client 127.0.0.1#44165 (multi.ucs): transfer of 'multi.ucs/IN': AXFR-style IXFR started (serial 309)
Mar  5 11:59:12 slave named[42261]: transfer of 'multi.ucs/IN' from 127.0.0.1#7777: failed while receiving responses: CNAME and other data

Note: You might also notice that newly created DNS entries cannot be resolved on hosts with dns/backend=ldap.
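
To check quickly whether a server is affected, the syslog can be searched for the error shown above. The following is a minimal sketch, not a UCS tool: the function name failed_transfer_zones is hypothetical, and it assumes the log lines have the same layout as the example entries.

```shell
#!/bin/sh
# failed_transfer_zones (hypothetical helper): read syslog lines on stdin and
# print each zone whose transfer failed with "CNAME and other data".
# Assumes the named log format shown in the example entries above.
failed_transfer_zones() {
    sed -n "s|.*transfer of '\([^']*\)/IN' from .*: failed while receiving responses: CNAME and other data.*|\1|p" \
        | sort -u
}

# Usage on the failing server:
#   failed_transfer_zones < /var/log/syslog
```

If the function prints a zone name, continue with the steps below for that zone.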

Solution

Step 1

On the failing server, perform a zone transfer and list all CNAME records:

root@slave:~# dig @localhost -p 7777 axfr multi.ucs| grep CNAME
badname.multi.ucs.	10800	IN	CNAME	master.multi.ucs.
55ea8def-e3d0-4825-8da3-df6f4076a2bf._msdcs.multi.ucs. 10800 IN	CNAME slave.multi.ucs.
9088bd86-0353-426d-a7e6-8f1613443bc0._msdcs.multi.ucs. 80600 IN	CNAME master.multi.ucs.
88befe4b-dc88-4997-b140-1ea3c75eda3d._msdcs.multi.ucs. 10800 IN	CNAME backup.multi.ucs.
univention-repository.multi.ucs. 80600 IN CNAME	master.multi.ucs.

In this case use the following names for the next step:

badname.multi.ucs.
55ea8def-e3d0-4825-8da3-df6f4076a2bf._msdcs.multi.ucs
9088bd86-0353-426d-a7e6-8f1613443bc0._msdcs.multi.ucs
88befe4b-dc88-4997-b140-1ea3c75eda3d._msdcs.multi.ucs
univention-repository.multi.ucs

Step 2

Now check each of these names for additional records, such as an A record, with the same name:

root@slave:~# dig @localhost -p 7777 multi.ucs axfr | grep badname
badname.multi.ucs.	10800	IN	CNAME	master.multi.ucs.
badname.multi.ucs.	10800	IN	A	129.168.45.45

Here, two records (CNAME and A) exist for the same name. This is not allowed: a name that has a CNAME record must not have any other data. The conflict is what causes the zone transfer to fail.
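
Steps 1 and 2 can be combined so that each CNAME does not have to be checked by hand. The sketch below is only an illustration: find_cname_conflicts is a hypothetical helper name, and it assumes the usual dig output layout (name, TTL, class, type, data). It treats every non-CNAME record type as a conflict, which is sufficient for zones without DNSSEC records.

```shell
#!/bin/sh
# find_cname_conflicts (hypothetical helper): read "dig ... axfr" output on
# stdin and print every owner name that has a CNAME record and at least one
# record of another type.
find_cname_conflicts() {
    awk '
        # Skip dig comment lines (starting with ";") and incomplete lines.
        /^;/ || NF < 5 { next }
        {
            name = tolower($1)          # DNS names are case-insensitive
            if ($4 == "CNAME") cname[name] = 1
            else               other[name] = 1
        }
        END {
            for (n in cname) if (n in other) print n
        }
    ' | sort
}

# Usage on the failing server (server, port and zone as in the steps above):
#   dig @localhost -p 7777 axfr multi.ucs | find_cname_conflicts
```

Every name printed by the function needs to be cleaned up as described in the following steps.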

Step 3

Identify the DN for these two entries:

root@slave:~# univention-ldapsearch "relativedomainName=badname*" dn
# extended LDIF
#
# LDAPv3
# base <dc=multi,dc=ucs> (default) with scope subtree
# filter: relativedomainName=badname*
# requesting: dn 
#

# badname, multi.ucs, dns, multi.ucs
dn: relativeDomainName=badname,zoneName=multi.ucs,cn=dns,dc=multi,dc=ucs

# badname.multi.ucs., multi.ucs, dns, multi.ucs
dn: relativeDomainName=badname.multi.ucs.,zoneName=multi.ucs,cn=dns,dc=multi,dc=ucs

# search result
search: 3
result: 0 Success

# numResponses: 3
# numEntries: 2
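
The search above returns only the DNs, so it does not show which entry holds the CNAME and which holds the A record. As a sketch, the record attributes can be requested as well and summarized per DN. The helper name classify_dns_entries is hypothetical. The example assumes that the records are stored in the aRecord and cNAMERecord attributes and that univention-ldapsearch passes the -LLL and -o ldif-wrap=no options through to ldapsearch.

```shell
#!/bin/sh
# classify_dns_entries (hypothetical helper): read unwrapped LDIF on stdin and
# print "<record type><TAB><dn>" for every A or CNAME value found.
classify_dns_entries() {
    awk '
        /^dn: /                         { dn = substr($0, 5) }   # remember current entry
        tolower($1) == "cnamerecord:"   { print "CNAME\t" dn }
        tolower($1) == "arecord:"       { print "A\t" dn }
    '
}

# Usage (filter as in the search above; the option pass-through is an assumption):
#   univention-ldapsearch -LLL -o ldif-wrap=no "relativeDomainName=badname*" \
#       dn aRecord cNAMERecord | classify_dns_entries
```

The output shows which DN belongs to which record type, which helps when deciding what to delete in the next step.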

Step 4

Note: Perform this step on your master server.

Delete the CNAME record (or the A record, whichever is wrong):

root@master:~# ldapdelete -x -D "cn=admin,$(ucr get ldap/base)" -y /etc/ldap.secret "relativeDomainName=badname,zoneName=multi.ucs,cn=dns,dc=multi,dc=ucs"

Step 5

Restart bind9 on the failing server:

root@slave:~# systemctl restart bind9

You should now be able to resolve names in your domain again.
