DRS Replication fails

Your Nagios check reports DRS as critical:

Samba DRS CRITICAL: 647 failures on MASTER481, 647 failures on SLAVE483

If the Nagios check reports any failures, the following command gives further details. In this case it shows WERR_BADFILE errors:

root@backup482:~# samba-tool drs showrepl
DSA Options: 0x00000001
DSA object GUID: 54a3d4fb-76a8-4983-aedb-c83bae062ea9
DSA invocationId: e7fe4187-e52f-44ec-a912-62483256e8fb


    Default-First-Site-Name\MASTER481 via RPC
        DSA object GUID: 54452074-5088-4b85-b1ef-e2edf184e1c2
        Last attempt @ Wed Jan  4 09:43:37 2017 CET failed, result 2 (WERR_BADFILE)
        2 consecutive failure(s).
        Last success @ Wed Jan  4 09:36:50 2017 CET
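
When several neighbor DCs are listed, it can help to reduce the showrepl output to the failing links only. The following is a minimal sketch, not part of UCS: the function name drs_failed_neighbours is hypothetical, and it assumes the output layout shown above (a "via RPC" line per neighbor followed by a "Last attempt" line).

```shell
# Hypothetical helper: reads `samba-tool drs showrepl` output on stdin and
# prints "<neighbor> <error name>" for every link whose last attempt failed.
# Assumes the layout shown above; other Samba versions may format differently.
drs_failed_neighbours() {
    awk '
        # Remember the neighbor this block of lines belongs to.
        / via RPC/ { neighbour = $1 }
        # A failed attempt ends with the error name in parentheses.
        /Last attempt @ .* failed, result/ {
            code = $NF
            gsub(/[()]/, "", code)
            print neighbour, code
        }
    '
}

# Usage on a DC (as root):
#   samba-tool drs showrepl | drs_failed_neighbours
```

An empty result means that no link reported a failed last attempt.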

The command

samba-tool drs kcc

can be used to manually trigger the Samba 4 “Knowledge Consistency Checker” (KCC), which updates its knowledge about connections to neighbor DCs.

A general debugging article, http://sdb.univention.de/1235, can help as a first step. This article focuses on a current issue that appears after joining a server, and it gives more information for cases where DNS is the problem, especially if you installed Samba 4 on a UCS server before version 4.0-4 and dns/backend is set to samba4. Check the backend and the essential DNS records:

ucr get dns/backend
/usr/share/univention-samba4/scripts/check_essential_samba4_dns_records.sh | grep 'not found'
Host gc._msdcs not found: 3(NXDOMAIN)
Host _ldap._tcp.gc._msdcs not found: 3(NXDOMAIN)
Host _ldap._tcp.dc._msdcs not found: 3(NXDOMAIN)
Host _ldap._tcp.pdc._msdcs not found: 3(NXDOMAIN)
Host _ldap._tcp.a785bb4a-c9b4-494e-b0a4-33ff8e2ed290.domains._msdcs not found: 3(NXDOMAIN)
Host _kerberos._tcp.dc._msdcs not found: 3(NXDOMAIN)
Host 47fe5d26-ffd2-464a-9a24-0e72af5cdf65._msdcs not found: 3(NXDOMAIN)
Host 54452074-5088-4b85-b1ef-e2edf184e1c2._msdcs not found: 3(NXDOMAIN)
Host 54a3d4fb-76a8-4983-aedb-c83bae062ea9._msdcs not found: 3(NXDOMAIN)
Host _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs not found: 3(NXDOMAIN)
Host _kerberos._tcp.Default-First-Site-Name._sites.dc._msdcs not found: 3(NXDOMAIN)
Host _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs not found: 3(NXDOMAIN)
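
To work with the list of missing records in a script, the "not found" lines can be reduced to the bare record names. This is a minimal sketch; the function name missing_dns_records is hypothetical, and it assumes the "Host ... not found: ..." line format shown above.

```shell
# Hypothetical helper: reads the output of
# check_essential_samba4_dns_records.sh on stdin and prints only the
# names of records reported as "not found", one per line.
missing_dns_records() {
    sed -n 's/^Host \(.*\) not found: .*$/\1/p'
}

# Usage:
#   /usr/share/univention-samba4/scripts/check_essential_samba4_dns_records.sh | missing_dns_records
```

If the function prints nothing, the script reported no missing records.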

You can also check whether Samba contains two _msdcs entries:

root@backup482:~# univention-s4search DC=_msdcs --cross-ncs dn
# record 1
dn: DC=_msdcs,DC=deadlock48.intranet,CN=MicrosoftDNS,DC=DomainDnsZones,DC=deadlock48,DC=intranet
# record 2
dn: DC=_msdcs,DC=deadlock48.intranet,CN=MicrosoftDNS,CN=System,DC=deadlock48,DC=intranet

Record 2 is currently the problem: this entry is created after a server joins the domain. When the server creates its DNS entry, the second entry is created as well, but it is recognized as a zone entry.
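
The presence of the entry below CN=System can also be tested in a script. The following is a minimal sketch; the function name has_system_msdcs_entry is hypothetical, and it assumes that each DN is printed on a single "dn:" line as in the output above.

```shell
# Hypothetical helper: reads the output of
#   univention-s4search DC=_msdcs --cross-ncs dn
# on stdin and returns 0 if an _msdcs entry below CN=System is present,
# non-zero otherwise.
has_system_msdcs_entry() {
    grep -qi '^dn: DC=_msdcs,.*,CN=MicrosoftDNS,CN=System,'
}

# Usage:
#   if univention-s4search DC=_msdcs --cross-ncs dn | has_system_msdcs_entry; then
#       echo "second _msdcs entry found below CN=System"
#   fi
```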

As a workaround, you can set dns/backend to ldap on all Samba 4 servers. This setting prevents Windows clients from performing their automatic DNS updates, which is not a problem if they are configured with a static IP address or a static DHCP lease.

Deleting the second entry is only a temporary fix, because samba_dnsupdate recreates it.

ucr set dns/backend='ldap'
/etc/init.d/bind9 restart

After this, the output of

/usr/share/univention-samba4/scripts/check_essential_samba4_dns_records.sh  | grep 'not found'

should be empty.