Problem: Troubleshooting slapd during a failed.ldif import

listener
failed-ldif
ldap
slapd
ucs-4-4

Problem:

A failed.ldif is found on your server. You may know this article from the past.

Since the introduction of systemd, following that article can lead to complications.
Try the following steps instead to re-read the failed.ldif.

Step 1

Check the slapd status:
systemctl status slapd.service
The output may look like this, with Active: active (running):

root@master:~# systemctl status slapd.service 
● slapd.service - LSB: OpenLDAP standalone server (Lightweight Directory Access Protocol)
   Loaded: loaded (/etc/init.d/slapd; generated; vendor preset: enabled)
   Active: active (running) since Wed 2019-05-08 11:25:08 CEST; 1h 4min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 752 ExecStart=/etc/init.d/slapd start (code=exited, status=0/SUCCESS)
 Main PID: 1379 (slapd)
    Tasks: 7 (limit: 4915)
   Memory: 16.0M
      CPU: 5.837s
   CGroup: /system.slice/slapd.service
           └─1379 /usr/sbin/slapd -h ldapi:/// ldap://:7389/ ldaps://:7636/

In that case you can try the re-read via systemctl restart slapd.service or /etc/init.d/slapd restart
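
For scripting the check from Step 1, `systemctl is-active` prints the unit state as a single word, which is easier to test than the full status output. A minimal sketch; the function name `slapd_is_active` is only illustrative and not part of UCS:

```shell
# Succeeds (exit code 0) only if systemd reports slapd.service as "active".
# slapd_is_active is a hypothetical helper name.
slapd_is_active() {
    [ "$(systemctl is-active slapd.service 2>/dev/null)" = "active" ]
}

# Example use:
#   slapd_is_active && systemctl restart slapd.service
```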

Step 2

If the import/re-read fails, the systemctl status output may look like this:

systemctl status slapd.service
● slapd.service - LSB: OpenLDAP standalone server (Lightweight Directory Access Protocol)
   Loaded: loaded (/etc/init.d/slapd; generated; vendor preset: enabled)
   Active: failed (Result: timeout) since Mon 2019-04-15 15:26:58 CEST; 20min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 29566 ExecStop=/etc/init.d/slapd stop (code=exited, status=0/SUCCESS)
  Process: 3654 ExecStart=/etc/init.d/slapd start (code=exited, status=0/SUCCESS)
 Main PID: 15274 (code=exited, status=0/SUCCESS)
    Tasks: 4 (limit: 4915)
   Memory: 12.1M
      CPU: 2.966s
   CGroup: /system.slice/slapd.service
           └─3093 /usr/sbin/slapd -h ldapi:/// ldap://:7389/ ldaps://:7636/

Here you see Active: failed (Result: timeout), but there is still a process 3093 /usr/sbin/slapd -h ldapi:/// ldap://:7389/ ldaps://:7636/ running.
systemd no longer tracks this process and therefore will not stop it the next time. It no longer considers itself responsible for it.
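
The situation described above (unit not active, but a slapd process still present) can be detected with a small check. This is a sketch under the assumption that any process named exactly `slapd` is the leftover one; `slapd_is_orphaned` is a hypothetical helper name:

```shell
# Succeeds if systemd does not report slapd.service as "active"
# while a process named slapd is still present.
# slapd_is_orphaned is a hypothetical helper name.
slapd_is_orphaned() {
    state=$(systemctl is-active slapd.service 2>/dev/null)
    [ "$state" != "active" ] && pgrep -x slapd >/dev/null
}

# Example use:
#   slapd_is_orphaned && echo "leftover slapd process found, continue with Step 3"
```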

Step 3

You can now kill this process via pkill slapd
If you want to retry the failed.ldif import, you can start and stop slapd without systemd like this:
SYSTEMCTL_SKIP_REDIRECT=1 /etc/init.d/slapd start
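
The kill and the start from Step 3 can be combined so that the init script is only called once the old process is really gone. A minimal sketch; the function name, the `SLAPD_INIT` variable and the 30-second limit are assumptions made for illustration:

```shell
# Path of the init script; SLAPD_INIT is a hypothetical override variable.
SLAPD_INIT=${SLAPD_INIT:-/etc/init.d/slapd}

# Kill leftover slapd processes, wait up to about 30 seconds for them to exit,
# then start slapd through the init script, bypassing systemd.
restart_slapd_without_systemd() {
    pkill slapd || true
    tries=0
    while pgrep -x slapd >/dev/null; do
        tries=$((tries + 1))
        if [ "$tries" -gt 30 ]; then
            echo "slapd is still running, giving up" >&2
            return 1
        fi
        sleep 1
    done
    SYSTEMCTL_SKIP_REDIRECT=1 "$SLAPD_INIT" start
}
```

Calling `restart_slapd_without_systemd` as root then behaves like the two manual commands above, but refuses to start a second slapd while the first one is still shutting down.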

Step 4a

If the import of the failed.ldif went well, you have to check if the listener is still running.

ps aufx |grep listener 
root       516  0.0  0.0   4048     0 ?        Ss   11:24   0:00  \_ runsv univention-directory-listener
root     19574  0.0  0.0  14320   968 pts/1    S+   12:20   0:00                  |   \_ grep listener

In the output above, the listener is not running! Only the runsv supervisor is shown.
It should look like this:

root       516  0.0  0.0   4048     0 ?        Ss   11:24   0:00  \_ runsv univention-directory-listener
listener  1740  0.0  0.0 2893420  960 ?        S    11:25   0:02  |   \_ /usr/sbin/univention-directory-listener -F -d 2 -b dc=schein,dc=me -m /usr/lib/univention-directory-listener/system -c /var/lib/univention-directory-listener -ZZ -x -D cn=admin,dc=schein,dc=me -y /etc/ldap.secret
root     19574  0.0  0.0  14320   968 pts/1    S+   12:20   0:00                  |   \_ grep listener

You can start the listener via
systemctl restart univention-directory-listener.service
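
The distinction made in Step 4a (only `runsv` visible versus the actual daemon visible) can be checked by looking for the daemon binary path in the process list. A small sketch; `listener_running` is a hypothetical helper name:

```shell
# Reads ps output on stdin. Succeeds only if the listener daemon itself appears,
# not merely the runsv supervisor or a grep process.
# listener_running is a hypothetical helper name.
listener_running() {
    grep -q '/usr/sbin/univention-directory-listener'
}

# Example use:
#   ps aufx | listener_running || systemctl restart univention-directory-listener.service
```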

Step 4b

If the import did not succeed, move the failed.ldif file out of the way and restart the LDAP server (slapd) as described in Step 3.

mv /var/lib/univention-directory-replication/failed.ldif ~
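
If this happens more than once, a plain mv to the home directory overwrites the previous copy. The following sketch adds a timestamp to the file name; the function name and the timestamp suffix are illustrative additions, not something the original procedure requires:

```shell
# Move a failed.ldif out of the way without overwriting an earlier copy.
# set_aside_failed_ldif and the timestamp suffix are hypothetical additions.
set_aside_failed_ldif() {
    src=${1:-/var/lib/univention-directory-replication/failed.ldif}
    dest_dir=${2:-$HOME}
    if [ ! -e "$src" ]; then
        echo "no failed.ldif at $src" >&2
        return 1
    fi
    mv -- "$src" "$dest_dir/failed.ldif.$(date +%Y%m%d-%H%M%S)"
}
```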

Step 5

If the failed.ldif has been moved away and slapd still does not start, the failed.ldif is no longer the cause: slapd itself has a problem starting.

Possibility A

root@slave:~# SYSTEMCTL_SKIP_REDIRECT=1 /etc/init.d/slapd start
[FAIL] Starting ldap server(s): slapd ...failed.
[info] 5d1f3a4e mdb_db_open: database "dc=local,dc=domain,dc=de" cannot be opened: No such file or directory (2). Restore from backup! 5d1f3a4e backend_startup_one (type=mdb, suffix="dc=local,dc=domain,dc=de"): bi_db_open failed! (2) slap_startup failed.

In this case, if the affected system is not a master, re-join the machine with univention-join.
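
To recognize this case in a script, the output of the init script can be searched for the mdb_db_open message shown above. A minimal sketch; `mdb_database_missing` is a hypothetical helper name, and the pattern is taken only from the sample output in this post:

```shell
# Reads the init script output on stdin. Succeeds if it contains the
# "mdb_db_open: database ... cannot be opened" message from Possibility A.
# mdb_database_missing is a hypothetical helper name.
mdb_database_missing() {
    grep -q 'mdb_db_open: database .* cannot be opened'
}

# Example use:
#   SYSTEMCTL_SKIP_REDIRECT=1 /etc/init.d/slapd start 2>&1 | mdb_database_missing \
#       && echo "LDAP database missing: on a non-master, consider univention-join"
```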

Possibility B

Another case could be an empty or corrupted TLS file /etc/ldap/dh_2048.pem.

root@slave:~# killall -9 slapd
root@slave:~# slapd -d 1 -h ldap:///:7389/
[...]
TLS: could not read DH parameters file `/etc/ldap/dh_2048.pem'.
TLS: error:0906D06C:PEM routines:PEM_read_bio:no start line ../crypto/pem/pem_lib.c:686
5d1f5eb7 main: TLS init def ctx failed: -1
5d1f5eb7 slapd destroy: freeing system resources.
5d1f5eb7 shadowbind_db_destroy
5d1f5eb7 slapd stopped.
5d1f5eb7 connections_destroy: nothing to destroy.

You will notice the line TLS: could not read DH parameters file '/etc/ldap/dh_2048.pem'.
This file can become corrupted because of a full filesystem or for various other reasons. In either case you should re-create the file:
sh -x /usr/share/univention-ldap/create-dh-parameter-files

Afterwards, start slapd again with systemctl start slapd.
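
The "no start line" error above means that no PEM header was found in the file, so a quick plausibility check before and after re-creating it is to look for that header. A sketch only; `dh_file_has_pem_header` is a hypothetical helper name, and the check does not validate the parameters themselves:

```shell
# Succeeds if the file is non-empty and contains a PEM
# "-----BEGIN ... DH PARAMETERS-----" line.
# dh_file_has_pem_header is a hypothetical helper name.
dh_file_has_pem_header() {
    file=${1:-/etc/ldap/dh_2048.pem}
    [ -s "$file" ] && grep -q -- '-----BEGIN .*DH PARAMETERS-----' "$file"
}

# Example use:
#   dh_file_has_pem_header || sh -x /usr/share/univention-ldap/create-dh-parameter-files
```

For a stricter test, `openssl dhparam -in /etc/ldap/dh_2048.pem -check -noout` should also verify the parameters, assuming the openssl command line tool is installed.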

