UDN Replication: LISTENER ERROR: non consecutive move

Hello,
This is my situation:
A DC Master UCS 4.4-3 errata413
A DC Backup UCS 4.4-3 errata413
I have a problem with the listener-notifier mechanism:
/usr/lib/nagios/plugins/check_univention_replication
returns:
CRITICAL: no change of listener transaction id for last 0 checks (nid=41854 lid=41814)
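In that status line, nid is the notifier's latest transaction id and lid is the last id the listener has processed, so the difference is the number of transactions the listener is behind. A minimal Python sketch for pulling the two numbers out of the plugin output; the function name `replication_lag` is made up for illustration and the pattern is inferred only from the output lines quoted in this thread:

```python
import re

# Matches the "(nid=... lid=...)" suffix seen in the plugin output in this thread.
STATUS_RE = re.compile(r"\(nid=(\d+) lid=(\d+)\)")


def replication_lag(status_line):
    """Return (nid, lid, lag) parsed from a status line, or None if not found."""
    match = STATUS_RE.search(status_line)
    if match is None:
        return None
    nid, lid = int(match.group(1)), int(match.group(2))
    return nid, lid, nid - lid


if __name__ == "__main__":
    line = ("CRITICAL: no change of listener transaction id "
            "for last 0 checks (nid=41854 lid=41814)")
    print(replication_lag(line))
```

For the line above this gives a lag of 40 transactions (41854 - 41814).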

/usr/share/univention-directory-notifier/univention-translog check
returns nothing.

univention-s4connector-list-rejected
returns:

UCS rejected
S4 rejected
There may be no rejected DNs if the connector is in progress, to be sure stop the connector before running this script.
last synced USN: 127377

After increasing the debug level of notifier and listener, I find this in the listener log:

06.01.20 14:33:26.502 LISTENER ( ALL ) : cache_get_entry: Read Transaction abort
06.01.20 14:33:26.502 LISTENER ( ERROR ) : non consecutive move: 41815:r:relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,dc=geosens,dc=com << 41816:m:relativeDomainName=AURELIUS5,zoneName=geosens.com,cn=dns,dc=geosens,dc=41838 cn\3DTITUS,cn=computers,dc=geosens,dc=com
06.01.20 14:33:26.502 LISTENER ( ERROR ) : change_update_dn failed: 1
06.01.20 14:33:26.502 LISTENER ( ERROR ) : listener: 1
06.01.20 14:33:26.502 LDAP ( INFO ) : closing connection
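
To compare the error with other sources, it can help to extract the two transaction ids and their command letters from the "non consecutive move" line. A rough Python sketch; the helper name `moved_entries` is hypothetical, and the `<id>:<letter>:` pattern is an assumption read off this single log line, not taken from the listener source:

```python
import re

# Assumed shape of each entry in the error message: "<id>:<command letter>:<dn>".
ENTRY_RE = re.compile(r"(\d+):([a-z]):")


def moved_entries(log_line):
    """Return [(id, command), ...] found after 'non consecutive move:'."""
    _, marker, rest = log_line.partition("non consecutive move:")
    if not marker:
        return []  # not a "non consecutive move" line
    return [(int(tid), cmd) for tid, cmd in ENTRY_RE.findall(rest)]


if __name__ == "__main__":
    # Raw string so the backslash in the DN is kept literally.
    line = (r"06.01.20 14:33:26.502 LISTENER ( ERROR ) : non consecutive move: "
            r"41815:r:relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,"
            r"dc=geosens,dc=com << 41816:m:relativeDomainName=AURELIUS5,"
            r"zoneName=geosens.com,cn=dns,dc=geosens,dc=41838 "
            r"cn\3DTITUS,cn=computers,dc=geosens,dc=com")
    print(moved_entries(line))
```

For the line above this yields ids 41815 (command r) and 41816 (command m).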

The transaction file (/var/lib/univention-ldap/notify/transaction) looks fine: it contains all transactions from id 41815 to 41854 without holes. The first lines look like this:

41815 cn=eli,cn=dc,cn=computers,dc=geosens,dc=com m
41816 cn=default-settings,cn=ldap,cn=policies,dc=geosens,dc=com m
41817 cn=default-settings,cn=ldap,cn=policies,dc=geosens,dc=com m
41818 cn=univention-app,cn=ldapschema,cn=univention,dc=geosens,dc=com m
41819 cn=66univention-appcenter_app,cn=ldapacl,cn=univention,dc=geosens,dc=com m
41820 cn=app_syntax,cn=udm_syntax,cn=univention,dc=geosens,dc=com m

Why does the entry for id 41816 in the transaction file differ from the one in the listener log?
Any idea or hint how to fix this? I really appreciate any help.

Update: resetting the Listener/Notifier as described in this forum article did not solve the problem.

I guess these lines from the listener log on the master show the problem, but I don’t know how to fix it:

07.01.20 08:31:38.220 LISTENER ( ALL ) : cache_get_entry: Read Transaction abort
07.01.20 08:31:38.220 LISTENER ( ERROR ) : non consecutive move: 41815:r:relativeDomainName=aurelius,zoneName=geosens.com,cn=dns,dc=geosens,dc=com << 41816:m:relativeDomainName=AURELIUS5,zoneName=geosens.com,cn=dns,dc=geosens,dc=41838 cn\3DTITUS,cn=computers,dc=geosens,dc=com
07.01.20 08:31:38.220 LISTENER ( ERROR ) : change_update_dn failed: 1
07.01.20 08:31:38.220 LISTENER ( ERROR ) : listener: 1

Thank you.

Hi,

You had this issue two years ago and got a solution for it back then. Why not use it again?

Otherwise, using the search might have pointed to a possible solution.

Good Luck!

/CV

Correct, I already posted the same problem half a year ago.

Unfortunately I did not find a solution then and had to leave it due to lack of time.
With the latest update it popped up again. I spent several days around New Year trying to fix it. I think I tried every solution suggested on the forum - without success.

I checked / tried these:
Troubleshooting: Listener-/Notifier
Howto Check for Listener/ Notifier Service Status
What to do if a failed.ldif is found
Problem: Unable to (re-)join: 03univention-directory-listener.inst failed
Replication issues UDN
Problems with UDN Replication - Notifier ID and local ID do not match
MDB Database Full and Stops Replication
How to reset Listener / Notifier replication

I still did not manage to fix it. This is why I am asking for help again.

Hi,

I am not sure whether you have made the issue worse by simply trying all of the mentioned approaches…

Use the translog-check tool from my article and tell us the output.

/CV

I hope I did not break more than I repaired. I was careful. But it might be the case, as I obviously do not really understand the listener-notifier mechanism.

The output of /usr/share/univention-directory-notifier/univention-translog check is empty on both Master and Backup.

Still no changes:

root@PDC:~# /usr/share/univention-directory-notifier/univention-translog check
root@PDC:~#
root@PDC:~# /usr/lib/nagios/plugins/check_univention_replication
CRITICAL: no change of listener transaction id for last 0 checks (nid=41887 lid=41814)

root@BDC:~# /usr/share/univention-directory-notifier/univention-translog check
root@BDC:~#
root@BDC:~# /usr/lib/nagios/plugins/check_univention_replication
OK: replication complete (nid=41887 lid=41887)

Just for the record:

As I could not find a solution, I just left the problem alone; there seemed to be no serious consequences anyway.

Now, in March 2020, coronavirus gave me time to come back. To my big surprise, the problems seem to be gone. On both PDC and BDC the checks give no errors.
All I had to do was a rejoin, which worked flawlessly.

This is the first case I can remember where sitting it out solved an IT problem :slightly_smiling_face:.
I am grateful to the Univention programmers, who produce code that makes a system so resilient.
I just upgraded to 4.4-3 errata 499, everything is fine.

:+1::clap:
