Problem
In some Nubus for Kubernetes environments, the UMS LDAP Server pod may refuse to start after a restart or rescheduling event.
The pod logs may contain messages similar to the following:
mdb_id2entry_put: mdb_put failed: MDB_KEYEXIST: Key/data pair already exists(-30799) "cn=internal"
=> mdb_tool_entry_put: id2entry_add failed: err=-30799
=> mdb_tool_entry_put: txn_aborted! MDB_KEYEXIST: Key/data pair already exists (-30799)
slapadd: could not add entry dn="cn=internal" (line=1): txn_aborted! MDB_KEYEXIST: Key/data pair already exists (-30799)
mdb_id2entry_put: mdb_put failed: MDB_KEYEXIST: Key/data pair already exists(-30799) "cn=blocklists,cn=internal"
=> mdb_tool_entry_put: id2entry_add failed: err=-30799
=> mdb_tool_entry_put: txn_aborted! MDB_KEYEXIST: Key/data pair already exists (-30799)
slapadd: could not add entry dn="cn=blocklists,cn=internal" (line=1): txn_aborted! MDB_KEYEXIST: Key/data pair already exists (-30799)
Further down in the log, the LDAP service may appear to start before shutting down again:
slapd starting
listener initialization failed
daemon: shutdown requested and initiated.
slapd shutdown: waiting for 0 operations/tasks to finish
slapd stopped.
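To see at a glance which entries the startup import rejected, the slapadd lines can be reduced to a list of DNs. The following is a minimal sketch: `failed_dns` is a hypothetical helper name, and the pattern assumes the log format shown above.

```shell
# Hypothetical helper: reads a pod log on stdin and prints each DN that
# slapadd failed to add, once per DN.
failed_dns() {
  # Capture the text between dn=" and the closing quote on slapadd error lines.
  sed -n 's/^.*slapadd: could not add entry dn="\([^"]*\)".*$/\1/p' | sort -u
}
```

One possible usage, assuming kubectl access to the affected namespace, is `kubectl logs <ldap-pod> --previous | failed_dns`, where `--previous` reads the log of the last terminated container instance.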
Root Cause
Investigation has shown that the underlying cause can be excessive memory pressure on Kubernetes worker nodes. When a node experiences memory exhaustion, the Linux kernel or Kubernetes may terminate processes in order to maintain node stability.
In affected environments, the decisive evidence was found in the host operating system logs, which contained memory pressure events and process termination messages (SIGTERM).
Under heavy memory pressure, LDAP-related processes may be terminated unexpectedly while performing database operations. This can leave the LDAP database in an inconsistent state, causing subsequent startup attempts to fail with LMDB errors such as:
MDB_KEYEXIST: Key/data pair already exists
Investigation
Background Information
Two different memory exhaustion scenarios must be distinguished:
Container / cgroup OOM
This occurs when a container exceeds its configured memory limit (resources.limits.memory).
Characteristics:
- Kubernetes records an OOMKilled event.
- The process receives a SIGKILL.
- The process cannot intercept or log the signal.
- Events are visible through Kubernetes diagnostics.
Example commands:
kubectl describe pod <pod-name>
kubectl get events --field-selector reason=OOMKilling
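In addition to the commands above, the termination reason of the previous container instance can be read directly from the pod status. This is a sketch only: `last_termination_reasons` is a hypothetical helper name, and it assumes the current kubectl context already points at the namespace of the LDAP pod.

```shell
# Hypothetical helper: prints "<container>\t<reason>" for each container of a
# pod, where <reason> is the last recorded termination reason (for example
# OOMKilled after a cgroup OOM kill, or empty if none is recorded).
last_termination_reasons() {
  pod="$1"
  kubectl get pod "$pod" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}
```

A reason of OOMKilled points to the container-level scenario. An empty or different reason does not rule out node-level memory pressure.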
Node / System OOM
This occurs when the entire Kubernetes node runs out of memory.
Characteristics:
- The Linux kernel global OOM killer selects processes based on their oom_score.
- Memory-intensive processes such as slapd are common candidates.
- Kubernetes workloads may be evicted or rescheduled.
- Application logs often provide little or no useful information regarding the actual root cause.
- Evidence is typically found in node-level system logs.
In the observed cases, the LDAP logs indicated a graceful shutdown rather than an immediate SIGKILL, suggesting that the workload may have been evicted or terminated due to node memory pressure rather than a container-level OOM event.
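The two scenarios can often be told apart in the kernel log of the node. The sketch below labels each OOM line accordingly. `classify_oom_lines` is a hypothetical helper name, and the matched phrases are an assumption about typical kernel wording, which varies between kernel versions.

```shell
# Hypothetical helper: reads kernel log text on stdin and prefixes each OOM
# line with "cgroup" (memory limit of a container exceeded) or "node"
# (global OOM killer). All other lines are dropped.
classify_oom_lines() {
  while IFS= read -r line; do
    case "$line" in
      # The cgroup pattern must be tested first because it also contains
      # the lowercase phrase "out of memory".
      *"Memory cgroup out of memory"*) printf 'cgroup\t%s\n' "$line" ;;
      *"Out of memory"*)               printf 'node\t%s\n' "$line" ;;
    esac
  done
}
```

On a worker node with systemd, one way to feed it is `journalctl -k --no-pager | classify_oom_lines`. A pod eviction by the kubelet does not involve the kernel OOM killer, so an empty result here is compatible with the eviction scenario described above.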
Diagnosis
Review the Kubernetes node logs and operating system logs for indicators such as:
MemoryPressure
OOM Killer
Out of memory
SIGTERM
Eviction
Additionally, inspect pod events:
kubectl describe pod <ldap-pod>
kubectl get events --sort-by=.metadata.creationTimestamp
Monitor memory consumption across the cluster and compare actual usage with configured resource requests and limits.
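As a starting point for the node-level checks, the MemoryPressure condition of every node can be queried through the API. This is a minimal sketch: `nodes_under_memory_pressure` is a hypothetical helper name, and it only reflects the condition at the time of the call, not past incidents.

```shell
# Hypothetical helper: prints the names of all nodes whose MemoryPressure
# condition currently has the status "True".
nodes_under_memory_pressure() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="MemoryPressure")].status}{"\n"}{end}' \
    | awk -F'\t' '$2 == "True" { print $1 }'
}
```

For past incidents, `kubectl get events --field-selector reason=Evicted` lists kubelet evictions that are still retained, and `kubectl top pod <ldap-pod> --containers` shows current usage when Metrics Server is installed.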
Solution
Verify Resource Requests and Limits
Ensure that the LDAP server pods have memory requests and limits configured appropriately for the size of the environment.
Example:
resources:
  requests:
    memory: <appropriate-value>
  limits:
    memory: <appropriate-value>
The memory request should cover the normal operating memory consumption of the pod, and the memory limit should leave headroom above its observed peak consumption.
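To check what is actually configured on a running pod, the effective values can be read from the pod spec. The following is a sketch under assumptions: `containers_without_memory_limit` is a hypothetical helper name, and the current kubectl context is assumed to point at the namespace of the LDAP pod.

```shell
# Hypothetical helper: prints the name of every container in a pod that has
# no memory limit configured.
containers_without_memory_limit() {
  pod="$1"
  # Emit "<container>\t<memory request>\t<memory limit>" per container,
  # then keep the rows whose limit column is empty.
  kubectl get pod "$pod" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.requests.memory}{"\t"}{.resources.limits.memory}{"\n"}{end}' \
    | awk -F'\t' '$3 == "" { print $1 }'
}
```

An empty result means that every container in the pod has a memory limit. Whether the values are appropriate still has to be judged against measured usage.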
Monitor Resource Utilization
Memory requirements depend on factors such as:
- Directory size
- Number of users and groups
- Replication traffic
- Authentication workload
- Historical database growth
Continuous monitoring should be implemented using Kubernetes observability tooling such as:
- Metrics Server
- Prometheus
- Grafana
Resource requests and limits should be reviewed regularly and adjusted according to actual usage patterns.
Review Cluster Capacity
If multiple workloads compete for limited memory resources, Kubernetes may evict pods or the operating system may terminate processes to protect node stability.
Verify that:
- Worker nodes provide sufficient memory capacity.
- Workloads are evenly distributed across the cluster.
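- Memory overcommitment is avoided, that is, the combined memory limits of the pods on a node do not substantially exceed the memory the node can provide.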
- Cluster autoscaling is configured appropriately, where applicable.
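The checks above can be supported by looking at how much memory is already requested and limited on a given node. This sketch extracts the corresponding section from `kubectl describe node`. `node_allocated_resources` is a hypothetical helper name, and the section headings are an assumption about the human-readable output, which is not a stable interface.

```shell
# Hypothetical helper: prints the "Allocated resources" section of a node
# description, which summarizes requests and limits relative to capacity.
node_allocated_resources() {
  node="$1"
  # Start printing at the section heading and stop before the Events section.
  kubectl describe node "$node" \
    | awk '/^Allocated resources:/ { p = 1 } /^Events:/ { p = 0 } p'
}
```

A memory limits percentage above 100% in this section indicates that the node is overcommitted and may run out of memory if several pods approach their limits at the same time.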
Conclusion
The startup failure of the UMS LDAP Server is frequently a secondary symptom rather than the primary problem. In investigated cases, the root cause was memory pressure at the Kubernetes node level, which resulted in the termination of LDAP-related processes and subsequent LDAP database inconsistencies.
The recommended approach is to identify and resolve the underlying cluster resource issue, review memory requests and limits, and implement continuous monitoring to prevent future occurrences.