Problems with UDN Replication - Notifier ID and local ID do not match

Hello,

I have problems with the notifier / listener.

Setting:
A DC Master UCS 4.4-0 errata155
A DC Backup UCS 4.4-0 errata155

The ldap replication from master to backup works fine.
The problem is on the master. It started with a “system disk full” error; it turned out that notifier.log had grown to more than 40 GB.
I deleted it and upgraded from errata 137 -> 155, hoping this would fix the problem.

I checked the recommended articles on help.univention.com and tried quite a lot (hopefully not too much), but I am stuck now.
I have basically worked through https://help.univention.com/t/problem-umc-diagnostic-module-complains-about-problems-with-udn-replication/11707 without success.
Here is the current state:

/usr/lib/nagios/plugins/check_univention_replication
CRITICAL: no change of listener transaction id for last 0 checks (nid=42074 lid=41814)

sv status univention-directory-notifier | sed -n 's/:.*//p'
run

sv status univention-directory-listener | sed -n 's/:.*//p'
finish

univention-directory-listener-ctrl status
Listener status:
finish: univention-directory-listener: (pid 17772) 3s, normally down

Current Notifier ID on “elon.geosens.com”
42074

Last Notifier ID processed by local Listener:
41814

Last transaction processed:
42074 zoneName=geosens.com,cn=dns,dc=geosens,dc=com m

service univention-directory-listener stop
service univention-directory-notifier stop
service slapd stop
tar -C /var/lib/univention-ldap -czpvf /root/replication_backup notify/ listener/

/usr/share/univention-directory-notifier/univention-translog check
output:
2019-06-26 11:19:36,936:ERROR:/var/lib/univention-ldap/notify/transaction:242:‘41811 cn=Windows Hosts,cn=groups,dc=geosens,dc=com m\n’: Hole after ‘41806 cn=AURELIUS$,cn=uid,cn=temporary,cn=univention,dc=geosens,dc=com a’
2019-06-26 11:19:36,937:ERROR:/var/lib/univention-ldap/notify/transaction:244:‘41815 relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,dc=geosens,dc=com r\n’: Hole after ‘41812 cn=aurelius,cn=geosens.com,cn=dhcp,dc=geosens,dc=com r’
2019-06-26 11:19:36,937:ERROR:/var/lib/univention-ldap/notify/transaction:245:‘41839 cn=elon,cn=dc,cn=computers,dc=geosens,dc=com m\n’: Hole after ‘41815 relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,dc=geosens,dc=com r’

/var/lib/univention-ldap/notify/transaction needs fixing:

  • missing transactions in sequence

You can re-run this tool with the option “--fix” in order to try to fix this issue.
See https://help.univention.com/t/problem-umc-diagnostic-module-complains-about-problems-with-udn-replication/11707/1 for more details.

/usr/share/univention-directory-notifier/univention-translog check --fix
output:
2019-06-26 11:20:56,709:ERROR:/var/lib/univention-ldap/notify/transaction:242:‘41811 cn=Windows Hosts,cn=groups,dc=geosens,dc=com m\n’: Hole after ‘41806 cn=AURELIUS$,cn=uid,cn=temporary,cn=univention,dc=geosens,dc=com a’
2019-06-26 11:20:56,710:ERROR:/var/lib/univention-ldap/notify/transaction:244:‘41815 relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,dc=geosens,dc=com r\n’: Hole after ‘41812 cn=aurelius,cn=geosens.com,cn=dhcp,dc=geosens,dc=com r’
2019-06-26 11:20:56,711:ERROR:/var/lib/univention-ldap/notify/transaction:245:‘41839 cn=elon,cn=dc,cn=computers,dc=geosens,dc=com m\n’: Hole after ‘41815 relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,dc=geosens,dc=com r’

/var/lib/univention-ldap/notify/transaction needs fixing:

  • missing transactions in sequence

tail /var/log/univention/notifier.log
26.06.19 11:24:42.091 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener
26.06.19 11:24:48.475 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener
26.06.19 11:24:55.043 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener
26.06.19 11:25:01.409 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener
26.06.19 11:25:07.742 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener
26.06.19 11:25:14.116 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener
26.06.19 11:25:20.508 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener
26.06.19 11:25:26.963 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener
26.06.19 11:25:33.391 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener
26.06.19 11:25:39.818 TRANSFILE ( PROCESS ) : 8 failed, got 0 close connection to listener

tail /var/log/univention/listener.log
26.06.19 11:25:57.975 DEBUG_INIT
26.06.19 11:25:57.984 LISTENER ( WARN ) : Notifier/LDAP server is elon.geosens.com:7389
26.06.19 11:25:57.984 LDAP ( PROCESS ) : connecting to ldap://elon.geosens.com:7389
UNIVENTION_DEBUG_BEGIN : uldap.__open host=elon.geosens.com port=7389 base=dc=geosens,dc=com
UNIVENTION_DEBUG_END : uldap.__open host=elon.geosens.com port=7389 base=dc=geosens,dc=com
26.06.19 11:25:59.250 LISTENER ( PROCESS ) : updating ‘relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,dc=geosens,dc=com’ command r
26.06.19 11:25:59.250 LISTENER ( PROCESS ) : updating ‘relativeDomainName=AURELIUS5,zoneName=geosens.com,cn=dns,dc=geosens,dc=41838 cn\3DTITUS,cn=computers,dc=geosens,dc=com’ command m
26.06.19 11:25:59.251 LISTENER ( ERROR ) : non consecutive move: 41815:r:relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,dc=geosens,dc=com << 41816:m:relativeDomainName=AURELIUS5,zoneName=geosens.com,cn=dns,dc=geosens,dc=41838 cn\3DTITUS,cn=computers,dc=geosens,dc=com
26.06.19 11:25:59.251 LISTENER ( ERROR ) : change_update_dn failed: 1
26.06.19 11:25:59.251 LISTENER ( ERROR ) : listener: 1

I ran into this a few weeks ago; the problem is in the transaction log file. First stop the notifier, listener and LDAP services, back up the transaction file, and then check where the transaction numbers are not contiguous.

Where entries are missing, add a line of the following form for each one (I’m assuming your base DN), replacing ‘number’ with the missing transaction ID:

‘number’ dc=geosens,dc=com m

After adding these lines for the missing entries, run the check script on the file. If it comes back with no output, start the services again and give replication a few moments to kick off.
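
The manual step above can be sketched as a small helper that only prints the placeholder lines and never touches the file. This is a rough sketch, not an official Univention tool: the function name print_missing_transactions is made up for illustration, and the placeholder format is simply the one described in this post. Run it against a backup copy, review the output, and insert the lines by hand at the right positions.

```shell
# Hypothetical helper: print one placeholder line for every transaction ID
# that is missing from the sequence.
#   $1: path to a (backup copy of the) transaction file
#   $2: base DN to use in the placeholder lines
print_missing_transactions() {
    awk -v base="$2" '
        # The first field of every line is the transaction ID.
        NR > 1 && $1 > prev + 1 {
            # Emit "<id> <base DN> m" for each ID skipped since the previous line.
            for (id = prev + 1; id < $1; id++) print id, base, "m"
        }
        { prev = $1 }
    ' "$1"
}

# Example call (the path is an assumption; point it at your own backup copy):
# print_missing_transactions /root/transaction.copy 'dc=geosens,dc=com'
```

The output is meant for review only. The final state of the file should still be verified with univention-translog check before the services are started again.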

Thank you gcan.

I was just about to apply your recommendation when I found that the missing transaction IDs had miraculously already been filled with
‘number’ dc=geosens,dc=com m
as you recommended.

Now
/usr/share/univention-directory-notifier/univention-translog check
returns no error.

So now I am back to problem one:
/usr/lib/nagios/plugins/check_univention_replication

CRITICAL: no change of listener transaction id for last 0 checks (nid=42074 lid=41814)
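
For reference, the backlog behind this message is the difference between the notifier ID (nid) and the listener ID (lid). The following is a minimal sketch that extracts both numbers from such a status line; the function name replication_lag is hypothetical and the parsing assumes the exact "nid=... lid=..." wording shown above.

```shell
# Hypothetical helper: print how many transactions the listener is behind,
# given a status line that contains "nid=<number>" and "lid=<number>".
replication_lag() {
    # Pull out the digits that follow "nid=" and "lid=".
    nid=$(printf '%s\n' "$1" | sed -n 's/.*nid=\([0-9][0-9]*\).*/\1/p')
    lid=$(printf '%s\n' "$1" | sed -n 's/.*lid=\([0-9][0-9]*\).*/\1/p')
    # Give up if either number could not be found.
    [ -n "$nid" ] && [ -n "$lid" ] || return 1
    echo $((nid - lid))
}

# Example call:
# replication_lag "$(/usr/lib/nagios/plugins/check_univention_replication)"
```

With the values above (nid=42074, lid=41814) the listener is 260 transactions behind, and the first transaction it still has to process is 41815, which is the one named in the listener error.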

Looking into the listener log, I found this line suspicious:

01.07.19 18:39:14.030  LISTENER    ( ERROR   ) : non consecutive move: 41815:r:relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,dc=geosens,dc=com << 41816:m:relativeDomainName=AURELIUS5,zoneName=geosens.com,cn=dns,dc=geosens,dc=41838 cn\3DTITUS,cn=computers,dc=geosens,dc=com,

namely this part: cn\3DTITUS

I guess it should be “cn=TITUS”.
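
A note on the escape: \3D is the hex-escaped form of “=” in an LDAP DN, so the listener seems to have read “41838 cn=TITUS” as part of the value of the preceding dc= component. One possible reading, which is my assumption and not confirmed anywhere in this thread, is that two records ended up merged on one line of the transaction file. A heuristic sketch to look for such lines (the function name find_merged_records is hypothetical, and the pattern can produce false positives for DNs that legitimately contain a number followed by a space):

```shell
# Hypothetical helper: list transaction lines whose DN contains something that
# looks like an embedded "<transaction id> <attribute>=" fragment.
#   $1: path to a (backup copy of the) transaction file
find_merged_records() {
    # ^[0-9]+ <space>          the regular leading transaction ID
    # [^0-9][0-9]+ [A-Za-z]+=  a second number, a space, then "attr=" inside the DN
    grep -nE '^[0-9]+ .*[^0-9][0-9]+ [A-Za-z]+=' "$1"
}

# Example call (the path is an assumption; use your own backup copy):
# find_merged_records /root/transaction.copy
```

Each hit is printed with its line number, so the line can be inspected by hand before anything is changed.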

But how do I correct it?

I have a similar problem.

$ sudo tail -f /var/log/univention/listener.log
08.11.19 11:17:48.798  DEBUG_INIT
UNIVENTION_DEBUG_BEGIN  : uldap.__open host=local.domain.com port=7389 base=dc=local,dc=domain,dc=cm
UNIVENTION_DEBUG_END    : uldap.__open host=local.domain.com port=7389 base=dc=local,dc=domain,dc=cm
08.11.19 11:17:51.330  LISTENER    ( ERROR   ) : notifier.c:129:notifier_wait_id_result LDAP failed No such object (32): id:1871148
08.11.19 11:17:51.330  LISTENER    ( ERROR   ) : listener: 32

$ grep 1871148 /var/lib/univention-ldap/notify/transaction
1871148 dc=local,dc=domain,dc=com m

I understand that “1871148 dc=local,dc=domain,dc=com m” is a kind of placeholder transaction to fill the hole. As I understand it, it should replicate just fine, but it does not.

In addition, I have rejects. Could they be related to this?

$ sudo univention-s4connector-list-rejected

UCS rejected

    1:   UCS DN: <NORESYNC=broken file:1572733351.360289>;unknown
          S4 DN: <not found>
         Filename: /var/lib/univention-connector/s4/1572733351.360289

$ sudo wc -l /var/lib/univention-connector/s4/1572733351.360289
0 /var/lib/univention-connector/s4/1572733351.360289

I was able to find a solution to my problem in the article “Problem: no change of listener transaction id for last 0 checks”.

The command that helped me:
/usr/share/univention-directory-notifier/univention-translog load 1871148