Transaction file checking

Problems with replication may be caused by a corrupt transaction file.
The transaction file is located at:

/var/lib/univention-ldap/notify/transaction

The following mutations of the transaction file cause replication issues:

  • a gap in the consecutive numbering
  • lines that start with 0
  • lines that start with or contain unusual characters

A corrupt transaction file causes the notifier to “wait at” the corrupt line. Therefore replication stops at the operation listed in that line and no further changes are replicated.
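
The three mutations listed above can be told apart by looking at the first field of a line. The following is only a minimal sketch, assuming that a valid line has the form "<ID> <DN> <command>" as in the examples later in this article. The helper name classify_line is hypothetical, and treating control characters as "unusual characters" is an assumption made for illustration:

```python
def classify_line(line, expected_id):
    """Return a short description of the problem, or None if the line looks fine.

    Assumption: a valid line looks like "<ID> <DN> <command>".
    """
    stripped = line.rstrip("\n")
    # The ID is everything before the first space
    head = stripped.split(" ", 1)[0]
    # Assumption: control characters anywhere in the line count as "unusual"
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in stripped):
        return "unusual characters"
    # An empty or non-numeric ID field is also treated as "unusual"
    if head == "" or any(ch not in "0123456789" for ch in head):
        return "unusual characters"
    if head.startswith("0"):
        return "starts with 0"
    if int(head) != expected_id:
        return "numbering interrupted"
    return None
```

For example, classify_line("05 dc=foo,dc=bar m", 5) returns "starts with 0", while classify_line("5 dc=foo,dc=bar m", 5) returns None.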

How can these mutations be found

The following Python script finds the first mutation in the file. Because the file may be affected by several types of error, run the script again after fixing each issue:

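#!/usr/bin/env python
# Prints the first line whose ID breaks the consecutive numbering, then stops.
# print() with a single argument works with Python 2 and Python 3.
with open('/var/lib/univention-ldap/notify/transaction', 'r') as transaction:
  lc = 1  # expected ID; the numbering is assumed to start at 1
  for line in transaction:
    # The ID is the first space-separated field of the line
    head = line.strip().split(' ', 1)[0]
    try:
      cur_lc = int(head)
    except ValueError:
      # Empty line or non-numeric ID
      print('ERROR at line %d: "%s"' % (lc, line.rstrip('\n')))
      break
    # Comparing the text as well catches IDs written with a leading 0
    if cur_lc != lc or head != str(lc):
      print('ERROR at line %d: "%s"' % (lc, line.rstrip('\n')))
      break
    lc += 1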
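
To get an overview before starting a repair, it can help to list every suspicious line in one pass instead of stopping at the first one. The following is a sketch only, not part of UCS: the function name find_problems is hypothetical, the numbering is assumed to start at 1 as in the script above, and a line with an unreadable ID is assumed to occupy exactly one ID slot.

```python
def find_problems(lines):
    """Return a list of (line_number, text) for every suspicious line."""
    problems = []
    expected = 1  # assumption: IDs start at 1, as in the script above
    for line_number, line in enumerate(lines, 1):
        head = line.strip().split(" ", 1)[0]
        try:
            current = int(head)
        except ValueError:
            # Unreadable ID: report it and assume it used up one ID slot
            problems.append((line_number, line.rstrip("\n")))
            expected += 1
            continue
        if current != expected:
            problems.append((line_number, line.rstrip("\n")))
        # Resynchronise so that one gap is reported only once
        expected = current + 1
    return problems
```

The function accepts any iterable of lines, so an open file object for a copy of the transaction file can be passed directly.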

How can the file be repaired

1. Missing IDs can be regenerated by renumbering the whole transaction file with the following commands:

service univention-directory-notifier stop
service univention-directory-listener stop
service slapd stop
# Now re-enumerate the transaction file:
cut -d ' ' -f 2- /var/lib/univention-ldap/notify/transaction | awk '{print NR " " $0}' > /var/lib/univention-ldap/notify/transaction.new
mv /var/lib/univention-ldap/notify/transaction /var/lib/univention-ldap/notify/transaction.bak
mv /var/lib/univention-ldap/notify/transaction.new /var/lib/univention-ldap/notify/transaction
# And set last_id to the last value:
tail -n 1 /var/lib/univention-ldap/notify/transaction | awk '{print $1}' > /var/lib/univention-ldap/last_id
service univention-directory-notifier start
service slapd start
service univention-directory-listener start

Afterwards, all UCS servers should be re-joined.
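
To see what the cut and awk pipeline above does before running it on the real file, the same renumbering can be sketched in Python and tried on a copy. This is an illustration under the assumption that the ID is everything before the first space; the function name renumber is hypothetical, and the sketch does not replace stopping the services as shown above.

```python
def renumber(lines):
    """Replace the first field of every line with its line number."""
    result = []
    for number, line in enumerate(lines, 1):
        # Like `cut -d ' ' -f 2-`: drop everything up to the first space;
        # a line without any space is kept unchanged, as cut does
        parts = line.rstrip("\n").split(" ", 1)
        rest = parts[1] if len(parts) == 2 else parts[0]
        # Like `awk '{print NR " " $0}'`: prefix the line number
        result.append("%d %s\n" % (number, rest))
    return result


def last_id(lines):
    """Return the ID of the last line, i.e. the value written to last_id."""
    return int(lines[-1].split(" ", 1)[0])
```

Applied to three lines numbered 1, 2 and 7, renumber returns the same lines numbered 1, 2 and 3, and last_id of the result is 3.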

2. A gap in the numbering can be filled by editing the transaction file manually

If the transaction file has a gap in the numbering, the gap can be filled by adding a modify entry for the ldap/base for each missing number.

Open the transaction file with your preferred editor and navigate to the line identified by the script above, where the gap in the numbering is visible. Add one line for each missing number as shown below, replacing “dc=foo,dc=bar” with your own ldap/base (ucr get ldap/base):

2658985 dc=foo,dc=bar m
2658986 dc=foo,dc=bar m
2658987 dc=foo,dc=bar m
2658988 dc=foo,dc=bar m
2658989 dc=foo,dc=bar m
2658990 dc=foo,dc=bar m
2658991 dc=foo,dc=bar m
2658992 dc=foo,dc=bar m
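
For a larger gap, typing the filler lines by hand is error-prone. The following sketch generates them. The function name filler_lines is hypothetical, and the two neighbouring IDs in the example (2658984 and 2658993) are assumed values chosen to match the lines shown above.

```python
def filler_lines(last_present, next_present, ldap_base):
    """Return one modify line for every ID strictly between the two given IDs."""
    return [
        "%d %s m" % (missing_id, ldap_base)
        for missing_id in range(last_present + 1, next_present)
    ]


# Example: IDs 2658984 and 2658993 exist, everything in between is missing
for filler in filler_lines(2658984, 2658993, "dc=foo,dc=bar"):
    print(filler)
```

The printed lines can then be pasted into the transaction file at the position of the gap.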

After editing the transaction file, make sure the notifier ID is equal to the ID of the last entry in the transaction file:

echo -n 2658992 > /var/lib/univention-directory-listener/notifier_id
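
Whether the two values agree can also be checked with a short script. This is a sketch with hypothetical function names; it assumes that notifier_id contains nothing but the number, as written by the echo command above.

```python
def last_transaction_id(transaction_lines):
    """Return the ID of the last non-empty line, or None if there is none."""
    last = None
    for line in transaction_lines:
        if line.strip():
            last = line
    if last is None:
        return None
    return int(last.strip().split(" ", 1)[0])


def ids_match(transaction_lines, notifier_id_text):
    """Compare the last transaction ID with the content of notifier_id."""
    return last_transaction_id(transaction_lines) == int(notifier_id_text.strip())
```

To use it, pass the lines of /var/lib/univention-ldap/notify/transaction and the content of /var/lib/univention-directory-listener/notifier_id; a result of False means the notifier ID still has to be adjusted.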

A large transaction file can make the join of new UCS systems take a very long time, or block the start of the univention-directory-notifier daemon.

See also SDB article 1296.

