Problem
You notice a high CPU load on your UCS server.
Environment
When checking the output of the top command, you notice several processes named “atd”, mostly at the top of the list, consuming CPU.
The atd status shows:
root@ucs:~# /etc/init.d/atd status
● atd.service - Deferred execution scheduler
Loaded: loaded (/lib/systemd/system/atd.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-05-01 19:03:21 IST; 28min ago
Docs: man:atd(8)
Main PID: 485 (atd)
Tasks: 125 (limit: 4915)
Memory: 21.1M
CPU: 2h 32min 54.950s
CGroup: /system.slice/atd.service
├─ 485 /usr/sbin/atd -f
├─19456 /usr/sbin/atd -f
├─19460 /usr/sbin/atd -f
├─19461 /usr/sbin/atd -f
[...]
May 01 19:31:59 ucs[12329]: Userid 3314 not found - aborting job 9 (a000090193b5e2)
May 01 19:31:59 ucs[12324]: Userid 3314 not found - aborting job 26 (a0001a0193dcf8)
A non-local user with shell access (uid 3314) has scheduled some jobs through the atd service.
When jobs exist during startup of the server, atd tries to execute them.
During startup the LDAP service is started in parallel, so the user is not yet visible to atd and the jobs are aborted with the "Userid not found" message shown above. This repeats and keeps the CPU load high.
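The job names in the log lines above (for example a000090193b5e2) can help to see which jobs are affected and when they were due. The following is a minimal sketch, assuming the usual at spool naming of one queue letter, five hex digits for the job number, and eight hex digits for the run time in minutes since the epoch. The function name decode_at_job is hypothetical and not part of at or UCS.

```shell
#!/bin/sh
# decode_at_job NAME
# Split an at job file name such as a000090193b5e2 into its parts.
# Assumption: layout is <queue letter><5 hex: job number><8 hex: minutes since epoch>.
decode_at_job() {
    name=$1
    # Expect exactly 14 characters.
    if [ "${#name}" -ne 14 ]; then
        echo "unexpected job name: $name" >&2
        return 1
    fi
    # Everything after the queue letter must be a hex digit.
    case ${name#?} in
        *[!0-9a-fA-F]*)
            echo "unexpected job name: $name" >&2
            return 1
            ;;
    esac
    queue=$(printf '%s' "$name" | cut -c1)
    job_hex=$(printf '%s' "$name" | cut -c2-6)
    time_hex=$(printf '%s' "$name" | cut -c7-14)
    job=$((0x$job_hex))
    run_epoch=$((0x$time_hex * 60))   # minutes -> seconds since the epoch
    printf 'queue=%s job=%d run_epoch=%d\n' "$queue" "$job" "$run_epoch"
}
```

For the first log line, decode_at_job a000090193b5e2 prints queue=a job=9 run_epoch=1587454200, which matches "job 9" in the log. With GNU date, date -d @1587454200 converts the run time to a readable date.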
Solution
Step 1
Identify the user who created the jobs and ask them to delete all of their jobs.
root@ucs:~# getent passwd |grep 3314
max.muster:x:3314:5001:Max Musterr:/home/max.muster:/bin/bash
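To see which jobs belong to that user, root can filter the atq listing, since atq run as root lists the jobs of all users. This is a sketch, assuming the atq output format found on Debian-based systems, where the first field is the job number and the last field is the owner. jobs_for_user is a hypothetical helper name.

```shell
#!/bin/sh
# jobs_for_user USER
# Read atq-style lines on stdin and print the job numbers owned by USER.
# Assumption: first field = job number, last field = owner name.
jobs_for_user() {
    awk -v user="$1" '$NF == user { print $1 }'
}

# Example (as root):
#   atq | jobs_for_user max.muster
# To remove the listed jobs (xargs -r is a GNU extension):
#   atq | jobs_for_user max.muster | xargs -r atrm
```

Review the list before piping it into atrm, because removed jobs cannot be restored.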
Step 2
Restart the atd service after startup so that it re-reads the user IDs:
systemctl restart atd
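Before restarting, it can be useful to confirm that the user is resolvable again. The following is a minimal sketch; wait_for_uid is a hypothetical helper, and the default retry count and delay are arbitrary assumptions.

```shell
#!/bin/sh
# wait_for_uid UID [TRIES] [DELAY]
# Poll getent until the numeric UID resolves.
# Returns 0 as soon as the user is found, 1 after TRIES failed attempts.
wait_for_uid() {
    uid=$1
    tries=${2:-30}   # assumed default: 30 attempts
    delay=${3:-2}    # assumed default: 2 seconds between attempts
    while [ "$tries" -gt 0 ]; do
        if getent passwd "$uid" >/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: restart atd only once uid 3314 is visible.
#   wait_for_uid 3314 && systemctl restart atd
```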