Listener aborts with mdb_txn_commit: failed: MDB_PAGE_NOTFOUND

Problem

The Univention Directory Listener aborts with the following error message in /var/log/univention/listener.log:

cache_update_master_entry mdb_txn_commit: failed: MDB_PAGE_NOTFOUND: Requested page not found

Background

If an administrator uses the UMC or UDM to delete a reverse DNS zone together with all the resource records it contains, this triggers a series of changes in OpenLDAP for every resource record in that zone. For each PTR record, the object is removed first and the SOA serial number is then incremented in a second step. If the zone contains sufficiently many records, replication may take long enough that the Listener finds the containing DNS zone object already deleted from the OpenLDAP on the UCS Master while it is still replicating the deletion of the PTR records. The Listener is expected to handle this situation gracefully. In rare cases, however, the Directory Listener may fail to perform the final commit, which should save the last processed transaction ID to the Listener cache. To avoid corruption, the Listener aborts immediately in this situation. When the administrator then restarts the service, the Listener may get stuck and fail again with the same error.

Solution

Re-joining the UCS system should fix the issue, but there may be situations where the administrator doesn’t want to re-join. In such cases the following workaround may be useful. The Listener has already replicated the deletion of some of the PTR records, and the corresponding entries have already been deleted from the Listener cache too. Only the Listener ID needs to be updated so that the Listener can continue with the next transaction ID. This article explains how to do this: Howto update Listener CacheMasterEntry. Please note that the Listener may get stuck again after the removal of further PTR records. In that case the workaround may need to be applied repeatedly. It’s a good idea to check listener.log for progress.
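
Since the workaround may have to be repeated, it can help to check quickly whether the error has shown up in the log again. The following is a minimal sketch only, not part of UCS: it assumes the log is at the path given in this article and that the error text appears exactly as quoted above. The function names `find_commit_failures` and `scan_log` are made up for this illustration.

```python
import re

# Error text as quoted in this article; adjust if your log wording differs.
ERROR_PATTERN = re.compile(r"mdb_txn_commit: failed: MDB_PAGE_NOTFOUND")

# Log location named in this article.
DEFAULT_LOG = "/var/log/univention/listener.log"


def find_commit_failures(lines):
    """Return (line_number, text) pairs for lines containing the error."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if ERROR_PATTERN.search(line)
    ]


def scan_log(path=DEFAULT_LOG):
    """Scan the given log file and return all matching lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_commit_failures(handle)


if __name__ == "__main__":
    try:
        hits = scan_log()
    except OSError as exc:
        # The file may be missing or unreadable on this system.
        print(f"Could not read {DEFAULT_LOG}: {exc}")
    else:
        print(f"{len(hits)} occurrence(s) of the commit failure found")
        for number, text in hits[-5:]:
            # Show only the most recent matches.
            print(f"line {number}: {text}")
```

If the number of occurrences grows after the workaround has been applied and the Listener restarted, the Listener has most likely hit the error again while removing another PTR record.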
