Problem: UMS LDAP Server Pod Fails to Start with MDB_KEYEXIST Errors

MiracErde · June 24, 2026, 4:13pm

Problem

In some Nubus for Kubernetes environments, the UMS LDAP Server pod may refuse to start after a restart or rescheduling event.

The pod logs may contain messages similar to the following:

mdb_id2entry_put: mdb_put failed: MDB_KEYEXIST: Key/data pair already exists(-30799) "cn=internal"
=> mdb_tool_entry_put: id2entry_add failed: err=-30799
=> mdb_tool_entry_put: txn_aborted! MDB_KEYEXIST: Key/data pair already exists (-30799)
slapadd: could not add entry dn="cn=internal" (line=1): txn_aborted! MDB_KEYEXIST: Key/data pair already exists (-30799)

mdb_id2entry_put: mdb_put failed: MDB_KEYEXIST: Key/data pair already exists(-30799) "cn=blocklists,cn=internal"
=> mdb_tool_entry_put: id2entry_add failed: err=-30799
=> mdb_tool_entry_put: txn_aborted! MDB_KEYEXIST: Key/data pair already exists (-30799)
slapadd: could not add entry dn="cn=blocklists,cn=internal" (line=1): txn_aborted! MDB_KEYEXIST: Key/data pair already exists (-30799)

Further down in the log, the LDAP service may appear to start before shutting down again:

slapd starting
listener initialization failed
daemon: shutdown requested and initiated.
slapd shutdown: waiting for 0 operations/tasks to finish
slapd stopped.
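A saved copy of the pod log can be checked for this signature without reading it line by line. The following Python sketch is one possible way to do that. The function name `find_keyexist_dns` is hypothetical, and the pattern assumes the slapadd message format shown in the excerpts above.

```python
import re

# Matches the slapadd failure line from the excerpt above and captures the DN.
# Assumption: the message format is the one shown in this article's log sample.
KEYEXIST_PATTERN = re.compile(
    r'slapadd: could not add entry dn="([^"]+)".*MDB_KEYEXIST'
)


def find_keyexist_dns(log_text):
    """Return the DNs that slapadd failed to add with MDB_KEYEXIST, in log order."""
    dns = []
    for line in log_text.splitlines():
        match = KEYEXIST_PATTERN.search(line)
        # Report each DN once, even if the pod retried several times.
        if match and match.group(1) not in dns:
            dns.append(match.group(1))
    return dns
```

An empty result means the log does not contain this particular signature. It does not prove that the database is consistent.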

Root Cause

Investigation has shown that the underlying cause can be excessive memory pressure on Kubernetes worker nodes. When a node runs out of memory, the Linux kernel OOM killer may terminate processes, or the kubelet may evict pods, in order to keep the node stable.

In affected environments, the decisive evidence was found in the host operating system logs, which contained memory pressure events and process termination messages (SIGTERM).

Under heavy memory pressure, LDAP-related processes may be terminated unexpectedly while performing database operations. This can leave the LDAP database in an inconsistent state, causing subsequent startup attempts to fail with LMDB errors such as:

MDB_KEYEXIST: Key/data pair already exists

Related investigation:

Bug 59475

Investigation

Background Information

Two different memory exhaustion scenarios must be distinguished:

Container / cgroup OOM

This occurs when a container exceeds its configured memory limit (resources.limits.memory).

Characteristics:

Kubernetes records OOMKilled as the termination reason of the affected container.
The process receives a SIGKILL.
The process cannot intercept or log the signal.
The termination is visible through Kubernetes diagnostics such as the pod description.

Example commands:

kubectl describe pod <pod-name>
kubectl get events --field-selector reason=OOMKilling

Node / System OOM

This occurs when the entire Kubernetes node runs out of memory.

Characteristics:

The Linux kernel global OOM killer selects processes based on their oom_score.
Memory-intensive processes such as slapd are common candidates.
Kubernetes workloads may be evicted or rescheduled.
Application logs often provide little or no useful information regarding the actual root cause.
Evidence is typically found in node-level system logs.

In the observed cases, the LDAP logs indicated a graceful shutdown rather than an immediate SIGKILL, suggesting that the workload may have been evicted or terminated due to node memory pressure rather than a container-level OOM event.
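The distinction between the two scenarios can be partly automated from the pod status. The following Python sketch assumes the pod has been exported with `kubectl get pod <pod-name> -o json` and loaded into a dictionary. The function name `classify_termination` and its return labels are hypothetical.

```python
def classify_termination(pod):
    """Classify a pod dict as "container-oom", "evicted", or "unknown"."""
    status = pod.get("status", {})
    for container in status.get("containerStatuses", []):
        # Check both the current and the previous container state.
        for state_key in ("state", "lastState"):
            terminated = container.get(state_key, {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                return "container-oom"
    # Pods evicted by the kubelet carry the reason "Evicted" in their status.
    if status.get("reason") == "Evicted":
        return "evicted"
    return "unknown"
```

A result of "unknown" does not rule out memory pressure on the node. In that case the node-level system logs remain the place to look.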

Diagnosis

Review the Kubernetes node logs and operating system logs for indicators such as:

MemoryPressure
OOM Killer
Out of memory
SIGTERM
Eviction
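Searching exported node logs for these indicators can be scripted. The following Python sketch filters log lines with a case-insensitive substring match. The function name `find_memory_events` is hypothetical, and the "oom-killer" and "evict" spellings are assumptions added to catch common variants of the indicators listed above.

```python
# Lower-case search terms derived from the indicator list above.
# Assumption: "oom-killer" and "evict" are added to catch spelling variants.
INDICATORS = (
    "memorypressure",
    "oom killer",
    "oom-killer",
    "out of memory",
    "sigterm",
    "evict",
)


def find_memory_events(lines):
    """Return (line_number, matched_terms, line) for every line with an indicator."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        matched = [term for term in INDICATORS if term in lowered]
        if matched:
            hits.append((number, matched, line.rstrip("\n")))
    return hits
```

The input can be any iterable of lines, for example an open file containing an export of the node's system log.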

Additionally, inspect pod events:

kubectl describe pod <ldap-pod>
kubectl get events --sort-by=.metadata.creationTimestamp

Monitor memory consumption across the cluster and compare actual usage with configured resource requests and limits.

Solution

Verify Resource Requests and Limits

Ensure that the LDAP server pods have memory requests and limits configured appropriately for the size of the environment.

Example:

resources:
  requests:
    memory: <appropriate-value>
  limits:
    memory: <appropriate-value>

Both values should exceed the pod's normal operating memory consumption, with headroom for peak load.
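To judge whether a configured limit leaves enough headroom, the observed usage can be compared with the limit. The following Python sketch converts memory quantities to bytes and computes the free fraction. The function names are hypothetical, and only a subset of the Kubernetes quantity suffixes is handled.

```python
# Binary and decimal suffixes used in Kubernetes memory quantities.
# Assumption: this subset is sufficient; other suffixes are not handled.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity):
    """Convert a memory quantity such as "512Mi" or "2G" to bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: plain byte count


def headroom_ratio(usage, limit):
    """Return the fraction of the limit that is still free (0.25 means 25 % free)."""
    limit_bytes = parse_memory(limit)
    return (limit_bytes - parse_memory(usage)) / limit_bytes
```

For example, `headroom_ratio("768Mi", "1Gi")` evaluates to 0.25. Which ratio is acceptable depends on the environment and is not prescribed here.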

Monitor Resource Utilization

Memory requirements depend on factors such as:

Directory size
Number of users and groups
Replication traffic
Authentication workload
Historical database growth

Continuous monitoring should be implemented using Kubernetes observability tooling such as:

Metrics Server
Prometheus
Grafana

Resource requests and limits should be reviewed regularly and adjusted according to actual usage patterns.

Review Cluster Capacity

If multiple workloads compete for limited memory resources, Kubernetes may evict pods or the operating system may terminate processes to protect node stability.

Verify that:

Worker nodes provide sufficient memory capacity.
Workloads are evenly distributed across the cluster.
Memory is not overcommitted, i.e. the combined memory limits of the pods on a node do not exceed the memory the node can provide.
Cluster autoscaling is configured appropriately, where applicable.
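Whether a node is overcommitted can be estimated by comparing the sum of the pod memory limits with the node's allocatable memory. The following Python sketch assumes both have already been collected in MiB, for example from the output of `kubectl describe node <node-name>`. The function name `memory_overcommit_ratio` is hypothetical.

```python
def memory_overcommit_ratio(allocatable_mib, pod_limits_mib):
    """Return sum(pod limits) / node allocatable memory.

    Values above 1.0 mean the pods on the node may together use more
    memory than the node can provide.
    """
    if allocatable_mib <= 0:
        raise ValueError("allocatable_mib must be positive")
    return sum(pod_limits_mib) / allocatable_mib
```

For example, `memory_overcommit_ratio(8192, [2048, 2048, 4096, 2048])` evaluates to 1.25, meaning the limits exceed the node's allocatable memory by 25 %.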

Conclusion

The startup failure of the UMS LDAP Server is frequently a secondary symptom rather than the primary problem. In investigated cases, the root cause was memory pressure at the Kubernetes node level, which resulted in the termination of LDAP-related processes and subsequent LDAP database inconsistencies.

The recommended approach is to identify and resolve the underlying cluster resource issue, review memory requests and limits, and implement continuous monitoring to prevent future occurrences.