How-To: Check and Fix if Notifier Files are Corrupted

Note: Before proceeding, make sure you have followed this article first.

The notifier uses the files notify/transaction and listener/listener below /var/lib/univention-ldap/.

These files can become corrupted for various reasons.

Step 0

It is recommended to stop the services during troubleshooting:

systemctl stop univention-directory-listener
systemctl stop univention-directory-notifier
systemctl stop slapd

For safety, back up the existing files:
tar -C /var/lib/univention-ldap -czpvf /root/replication_backup notify/ listener/ translog/

Step 1

Verify files

Option 1 (recommended for newer versions of UCS)

Verify by tool
Starting with UCS 4.3-errata 470 and UCS 4.4-errata 33, you can use the command univention-translog to verify the consistency of the files. The files are OK if you do not get any output. Otherwise, the output might look like the following:

/usr/share/univention-directory-notifier/univention-translog check
2019-04-11 12:44:18,999:ERROR:/var/lib/univention-ldap/notify/transaction:2663:'2661 cn=nagios,cn=portal,cn=univention,dc=multi,dc=ucs d\n': Repeated line after '2662 cn=domain,cn=portal,cn=univention,dc=multi,dc=ucs m'
[...]
2019-04-11 12:44:20,825:ERROR:/var/lib/univention-ldap/notify/transaction:92499:'2661 cn=nagios,cn=portal,cn=univention,dc=multi,dc=ucs d\n': Repeated line after '2712 cn=rocketchat,cn=memberserver,cn=computers,dc=multi,dc=ucs m'

/var/lib/univention-ldap/notify/transaction needs fixing:
- the transactions are not sorted uniquely

You can re-run this tool with the option "--fix" in order to try to fix this issue.
See <https://help.univention.com/t/problem-umc-diagnostic-module-complains-about-problems-with-udn-replication/11707/1> for more details.
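
When the check prints many ERROR lines, it can help to count them per file before deciding what to fix first. The following is a minimal sketch, not part of univention-translog. It assumes the line layout seen in the excerpts in this article (timestamp, "ERROR", file path, optional line number, message), and the names ERROR_RE and summarize_errors are made up for this example.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts in this article:
#   <date> <time>,<ms>:ERROR:<path>[:<line number>]:<message>
ERROR_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+):ERROR:"
    r"(?P<path>/[^:]+)(?::(?P<lineno>\d+))?:(?P<rest>.*)$"
)


def summarize_errors(output_lines):
    """Count ERROR lines per file path; all other lines are ignored."""
    counts = Counter()
    for line in output_lines:
        match = ERROR_RE.match(line.rstrip("\n"))
        if match:
            counts[match.group("path")] += 1
    return dict(counts)
```

The function takes any iterable of lines, for example the saved output of the check command read from a file.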

Option 2 (for older versions)

For UCS 4.3 prior to errata 470 and UCS 4.4 prior to errata 33, the following script (check_files.py) can be used:

#!/usr/bin/env python
# Checks notify/transaction and listener/listener for syntax errors and
# notify/transaction for gaps in the transaction IDs. Requires python-ldap.
import ldap
import ldap.dn

BASEDIR = '/var/lib/univention-ldap'

for transactionfile in ('notify/transaction', 'listener/listener'):
  filepath = '%s/%s' % (BASEDIR, transactionfile)
  print('Checking %s' % filepath)
  with open(filepath, 'r') as f:
    lc = 0
    for line in f:
      lc += 1
      line = line.rstrip('\n')
      # Expected format of each line: "<transaction ID> <DN> <command>"
      head_tail = line.strip().split(' ', 1)
      if len(head_tail) != 2:
        print('ERROR missing second column at line %d: "%s"' % (lc, line))
        break
      (id, tail) = head_tail
      try:
        int(id)
      except ValueError:
        print('ERROR transaction ID is not an integer at line %d: "%s"' % (lc, line))
        break
      # The DN may contain spaces, so split off the command from the right
      head_tail = tail.rsplit(' ', 1)
      if len(head_tail) != 2:
        print('ERROR missing third column at line %d: "%s"' % (lc, line))
        break
      (dn, opcode) = head_tail
      if not ldap.dn.is_dn(dn):
        print('ERROR not a valid DN at line %d: "%s"' % (lc, line))
        break
    else:
      print('Syntax OK')
      continue
    # A syntax error was found: skip the remaining files
    break

filepath = '%s/notify/transaction' % BASEDIR
with open(filepath, 'r') as transaction:
  print('Checking %s for numbering issues' % filepath)
  lc = 0
  for line in transaction:
    lc += 1
    line = line.rstrip('\n')
    try:
      cur_id = int(line.strip().split(' ', 1)[0])
    except ValueError:
      print('ERROR at line %d, does not start with an integer number: "%s"' % (lc, line))
      break
    if lc == 1:
      start_id = cur_id
    # Every line must increase the ID of the first line by exactly one
    if cur_id != (lc - 1 + start_id):
      print('ERROR at line %d, transaction IDs not contiguous: "%s"' % (lc, line))
      break
  else:
    print('Numbering OK')

Output might look like this:

root@master:~# python check_files.py
Checking /var/lib/univention-ldap/notify/transaction
Syntax OK
Checking /var/lib/univention-ldap/listener/listener
Syntax OK
Checking /var/lib/univention-ldap/notify/transaction for numbering issues
ERROR at line 2663, transaction IDs not contiguous: "2661 cn=nagios,cn=portal,cn=univention,dc=multi,dc=ucs d"

When copying this script into a terminal or file, make sure to keep the indentation, because Python depends on it.

Step 2

Fix the issues. The following output is from univention-translog. The output of the script is not shown here, but it reports similar errors.

Result 1 “not sorted uniquely”

2019-04-11 14:28:30,869:ERROR:/var/lib/univention-ldap/notify/transaction:92961:'3122 uid=test29,cn=users,dc=multi,dc=ucs m\n': Repeated line after '3122 uid=test29,cn=users,dc=multi,dc=ucs m'

/var/lib/univention-ldap/notify/transaction needs fixing:
- the transactions are not sorted uniquely

Fix: In this case the tool should be able to fix the issue automatically:

/usr/share/univention-directory-notifier/univention-translog check --fix

Result 2 “invalid command”

2019-04-11 14:31:28,907:ERROR:/var/lib/univention-ldap/notify/transaction:3122:'3122 uid=test29,cn=users,dc=multi,dc=ucs k\n': Invalid command

/var/lib/univention-ldap/notify/transaction needs fixing:

Fix: Automated fix is not available. Valid commands are “m”, “r”, “a” or “d”. Check the lines and set the correct command. If in doubt set “m” as command.
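
To locate the affected lines before editing them by hand, a small helper can list every line whose last column is not one of the four valid commands. This is a minimal sketch under the assumption that each line has the form "<ID> <DN> <command>". The name find_invalid_commands is hypothetical and not part of any UCS tool.

```python
VALID_COMMANDS = {"m", "r", "a", "d"}


def find_invalid_commands(lines):
    """Return (line number, line) pairs whose last column is not a valid command."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        # The command is the last space-separated column of the line.
        parts = text.rsplit(" ", 1)
        if len(parts) != 2 or parts[1] not in VALID_COMMANDS:
            bad.append((number, text))
    return bad
```

It only reports lines and does not change the file. Pass it the open file object of notify/transaction or a list of lines.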

Result 3 “invalid last id”

2019-04-11 14:32:46,120:ERROR:/var/lib/univention-ldap/last_id: Invalid last id: should be 3123, but is 3122

/var/lib/univention-ldap/last_id needs manual fixing!

Fix: Automated fix is not available. Execute these commands:

{
tail -n 1 /var/lib/univention-ldap/notify/transaction
tail -n 1 /var/lib/univention-ldap/listener/listener
} | tail -n 1 | cut -d ' ' -f 1 > /var/lib/univention-ldap/last_id
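
The shell pipeline above takes the last line of listener/listener if that file has content, otherwise the last line of notify/transaction, and writes its first column to last_id. The same selection logic, as a sketch that only computes the value and writes nothing (the function name is made up for this example):

```python
def last_transaction_id(transaction_lines, listener_lines):
    """Return the ID of the newest transaction, or None if both inputs are empty.

    Mirrors the shell pipeline: the last line of the listener file wins,
    the last line of the transaction file is the fallback.
    """
    candidates = []
    for lines in (transaction_lines, listener_lines):
        lines = list(lines)
        if lines:
            candidates.append(lines[-1])
    if not candidates:
        return None
    # The transaction ID is the first space-separated column.
    return int(candidates[-1].split(" ", 1)[0])
```

An int() error here means the last line itself is damaged, so fix the file first before setting last_id.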

Result 4 “unparseable lines”

You get an error about IDs not being contiguous.
From univention-translog:

2019-04-11 14:22:07,697:ERROR:/var/lib/univention-ldap/notify/transaction:92961:'\xebc\x90\x10\x8e\xd0\xbc\x00\xb0\xb8\x00\x00\x8e\xd8\x8e\xc0\xfb\xbe\x00|\xbf\x00\x06\xb9\x00\x02\xf3\xa4\xea!\x06\x00\x00\n': Invalid line

/var/lib/univention-ldap/notify/transaction needs fixing:
- contains unparseable lines!

Fix: Automated fix is not available. Remove these lines, as they do not contain valid information.
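
Binary junk like the above is hard to spot in an editor. The following sketch separates lines that look like "<ID> <DN> <command>" from everything else. It works on bytes, so invalid characters cannot cause decoding errors. The pattern is a deliberately loose assumption and not the validation that univention-translog performs, and the names are hypothetical.

```python
import re

# Loose assumption of a valid line: digits, a space, a non-empty DN,
# a space, and one of the commands m, r, a or d.
LINE_RE = re.compile(rb"^\d+ \S.* [mrad]$")


def split_parseable(raw_lines):
    """Split byte lines into (parseable, unparseable) lists, keeping order."""
    good, bad = [], []
    for raw in raw_lines:
        if LINE_RE.match(raw.rstrip(b"\r\n")):
            good.append(raw)
        else:
            bad.append(raw)
    return good, bad
```

Open the file in binary mode ("rb") and review the unparseable list before removing anything from the real file. Work on a copy and keep the backup from Step 0.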

Result 5 “pending transactions are not consecutive”

2019-08-23 08:39:14,163:ERROR:/var/lib/univention-ldap/listener/listener:1:'20426 [...]: Not continuous with /var/lib/univention-ldap/notify/transaction: '20423 uid=[...] m'

Fix by adding the “--fix” option to the command:

root@ucs:~# /usr/share/univention-directory-notifier/univention-translog check --fix
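
To see what this error means, compare the last line of notify/transaction with the first line of listener/listener. The sketch below assumes, based on the error message above, that "continuous" means the first pending ID is exactly one higher than the last written ID. The function name is made up for this example.

```python
def pending_is_continuous(last_transaction_line, first_listener_line):
    """Return True if the first pending ID directly follows the last written ID."""
    last_id = int(last_transaction_line.split(" ", 1)[0])
    first_id = int(first_listener_line.split(" ", 1)[0])
    return first_id == last_id + 1
```

With the IDs from the example error, 20423 followed by 20426, the function returns False.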

Result 6 Any other message or “missing transactions in sequence”

The tool cannot identify the exact issue; the file appears to contain broken lines. Remove these lines and retry.
To find these lines more easily, you can use the script from this article.

Step 3

If all files are OK but the replication still shows errors, check the listener.log file again.

Check 1

You might see messages like this:

27.03.19 17:01:29.455  LISTENER    ( INFO    ) : notifier returned = id:2561	dn:<LDAP>	cmd:*
27.03.19 17:01:29.458  LISTENER    ( ERROR   ) : notifier.c:129:notifier_wait_id_result LDAP failed No such object (32): id:2213
27.03.19 17:01:29.458  LISTENER    ( ERROR   ) : listener: 32
27.03.19 17:01:29.458  LDAP        ( INFO    ) : closing connection
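
In a long listener.log, the failing transaction IDs can be collected with a small filter. This sketch assumes the message format shown in the excerpt above ("notifier_wait_id_result LDAP failed ... id:<number>"). The names are hypothetical.

```python
import re

# Matches the error line shown above and captures the transaction ID.
FAILED_ID_RE = re.compile(r"notifier_wait_id_result LDAP failed .*?: id:(\d+)")


def failed_transaction_ids(log_lines):
    """Return the transaction IDs of all 'LDAP failed' lines, in log order."""
    ids = []
    for line in log_lines:
        match = FAILED_ID_RE.search(line)
        if match:
            ids.append(int(match.group(1)))
    return ids
```

For the excerpt above, the result is a list containing only 2213; the "notifier returned" line with id:2561 is ignored.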

Check 2

Verify that the previous transaction (id=2212) is in the LDAP database:

root@master:/var/lib/univention-ldap/listener# univention-ldapsearch -LLL "reqSession=2212" -b cn=translog
dn: reqSession=2212,cn=translog
objectClass: auditObject
reqStart: 20201016230816Z
reqSession: 2212
reqDN: cn=member,cn=memberserver,cn=computers,dc=multi,dc=ucs
reqType: m

Check 3

Verify that the mentioned transaction (id=2213) is not in the LDAP database:

root@master:/var/lib/univention-ldap/listener# univention-ldapsearch  "reqSession=2213" -b cn=translog
# extended LDIF
#
# LDAPv3
# base <cn=translog> with scope subtree
# filter: reqSession=2213
# requesting: ALL
#

# search result
search: 3
result: 0 Success

# numResponses: 1

Intermediate step

If just one entry is missing between two existing entries in translog, the missing one can be loaded with:

/usr/share/univention-directory-notifier/univention-translog load 2213
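
Whether an ID is such a single gap can be decided from the list of IDs that do exist. The sketch below assumes the existing IDs have already been collected, for example from the reqSession values in cn=translog; how they are collected is left out. The function name is made up for this example.

```python
def single_gaps(present_ids):
    """Return the missing IDs whose direct predecessor and successor both exist."""
    present = set(present_ids)
    return sorted(
        i + 1 for i in present if i + 1 not in present and i + 2 in present
    )
```

Larger gaps are not reported on purpose, because the text above covers only the case of one missing entry between two existing ones.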

Check 4

Verify that last_id matches the ID of the last line in notify/transaction:

root@ucs:~# cat /var/lib/univention-ldap/last_id
2301
root@ucs:~# tail -n 1 /var/lib/univention-ldap/notify/transaction
2301 cn=member,cn=memberserver,cn=computers,dc=multi,dc=ucs m
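
The comparison can also be scripted. This is a minimal sketch that takes the content of last_id and the last line of notify/transaction as strings; the function name is hypothetical.

```python
def last_id_matches(last_id_text, last_transaction_line):
    """Return True if last_id equals the ID in the first column of the line."""
    last_id = int(last_id_text.strip())
    transaction_id = int(last_transaction_line.split(" ", 1)[0])
    return last_id == transaction_id
```

With the values shown above, 2301 in both places, the function returns True.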

Check 5

If your UCS version already supports the improved univention-translog check command (UCS 4.4-1 and newer), follow this article.
Otherwise, re-create the listener cache as described here.

Step 4

If you have been able to fix all errors, restart the services:

systemctl start slapd
systemctl start univention-directory-notifier
systemctl start univention-directory-listener