OK, let’s start narrowing this down by interpreting the output of samba-tool drs showrepl. The “INBOUND NEIGHBORS” section reports on directory data replicated from other Samba/AD domain controllers to the local Samba/AD DC where the command is run. The “OUTBOUND NEIGHBORS” section, on the other hand, reports on data that other Samba/AD DCs replicate from the local DC. WERR_DS_DRA_ACCESS_DENIED errors in the “INBOUND” section mean that the local host was unable to authenticate against the other Samba/AD DC when trying to replicate data from a specific partition. As a result, /var/log/samba/log.samba on the local host could show a log message reporting the authentication failure, e.g. something like this:
Failed to bind to uuid a4527054-2b0c-4325-9984-262333d9decd for ncacn_ip_tcp:10.20.30.40[49152,seal,krb5,target_hostname=a4527054-2b0c-4325-9984-262333d9decd._msdcs.company.de,target_principal=GC/smb.company.de/company.de,abstract_syntax=e3514235-4b06-11d1-ab04-00c04fc2dcd2/0x00000004,localaddress=10.20.30.41] NT_STATUS_UNSUCCESSFUL
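To pick only the failing inbound entries out of a longer showrepl output, a small filter can help. This is a minimal sketch: the function name inbound_errors is made up here, and it assumes that section headers look like "==== INBOUND NEIGHBORS ====" and that a failed attempt carries its WERR_ code on the same line, which may differ between Samba versions.

```shell
# inbound_errors: hypothetical helper, not part of Samba or UCS.
# Reads "samba-tool drs showrepl" output on stdin and prints only the lines
# from the INBOUND NEIGHBORS section that mention a WERR_ error code.
# Assumption: every section starts with a header line like "==== NAME ====".
inbound_errors() {
    awk '
        /INBOUND NEIGHBORS/    { in_section = 1; next }  # entering the inbound section
        /^==== /               { in_section = 0 }        # any other header ends it
        in_section && /WERR_/  { print }
    '
}

# Usage on the DC that reports the problem:
#   samba-tool drs showrepl | inbound_errors
```

An empty result only means that no line matched these assumptions, so it is worth comparing against the full showrepl output at least once.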
On the other hand, /var/log/samba/log.samba on the remote DC, which seems to go by the name “SMB” in your case, could contain a log message that reports the reason for the error. Here is an example: assume you have a Master and two Slaves, all three running Samba/AD. If you re-join Slave2 for some reason, replication to Slave1 may be temporarily broken, because Slave1 still holds a Kerberos service ticket for Slave2 that is no longer valid. As a result, samba-tool drs showrepl on Slave1 should show an error in the INBOUND section. Depending on the exact details, this situation can be observed in log.samba on the Samba/AD DC Slave2, where a message like this may appear:
GSS server Update(krb5)(1) Update failed: Miscellaneous failure (see text): Failed to find SLAVE1$@COMPANY.DE(kvno 1) in keytab FILE:/etc/krb5.keytab (arcfour-hmac-md5)
Please note that the error message may vary, depending on the exact details and Samba versions.
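Both kinds of messages can be searched for in one go. The following is only a sketch: auth_failures is a made-up helper name, the two patterns are taken from the example messages above, and since the wording varies they may need adjusting for your Samba version.

```shell
# auth_failures: hypothetical helper, not part of Samba or UCS.
# Prints log lines that resemble the two example messages above.
# Argument: path to the log file (defaults to /var/log/samba/log.samba).
auth_failures() {
    grep -E 'Failed to bind to uuid|Failed to find .* in keytab' \
        "${1:-/var/log/samba/log.samba}"
}

# Usage, on the local DC and on the remote DC:
#   auth_failures
#   auth_failures /var/log/samba/log.samba
```

If the function prints nothing, the log may simply use a different wording, so a manual look around the time of the failed replication attempt is still useful.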
Now, in this example situation (known as https://forge.univention.org/bugzilla/show_bug.cgi?id=35560 ) it’s easy to get the replication going again by restarting the samba processes on Slave1 (/etc/init.d/samba restart).
Then there is the bug that Dirk Ahrnke mentioned above. My gut feeling is that this is not what we have here, but you can check by running the following commands:
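# Look up the directory object that carries the GUID from the error message
ldif=$(univention-s4search objectGUID=a4527054-2b0c-4325-9984-262333d9decd)
# Extract its DN from the "dn: " line
dsa_dn=$(echo "$ldif" | sed -n 's/^dn: //p')
# Strip the first RDN to get the DN of the parent (server) object
server_dn=$(echo "$dsa_dn" | sed -n 's/^[^,]*,//p')
# Show the serverReference attribute of the server object
univention-s4search -b "$server_dn" -s base serverReference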
The output of the last command should contain a line starting with "serverReference: ". If not, then either my commands above had a problem (there should at least be a line starting with "dn: "), or you really might be facing the situation described in that Samba bug report. From reading the bug report, I’m not 100% sure whether the attribute would be missing on the local server or on the remote “SMB” server, so you should probably check on both.
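If you prefer not to read the LDIF by eye, the three possible outcomes can be told apart with a small filter. This is a sketch under assumptions: classify_server_reference is a made-up name, and it only looks at how the output lines start (including the base64 form with a double colon).

```shell
# classify_server_reference: hypothetical helper, not part of Samba or UCS.
# Reads LDIF on stdin and prints one of:
#   no-entry  no "dn:" line at all, so the search itself probably went wrong
#   present   the entry has a serverReference attribute
#   missing   the entry exists but has no serverReference attribute
classify_server_reference() {
    sr_ldif=$(cat)
    if ! printf '%s\n' "$sr_ldif" | grep -Eq '^dn::? '; then
        echo "no-entry"
    elif printf '%s\n' "$sr_ldif" | grep -Eq '^serverReference::? '; then
        echo "present"
    else
        echo "missing"
    fi
}

# Usage, with $server_dn set as in the commands above:
#   univention-s4search -b "$server_dn" -s base serverReference | classify_server_reference
```

Running it on both the local DC and the remote “SMB” DC covers both possibilities. A result of "missing" is only a hint that the bug report may apply, not proof.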