Problem: high CPU usage due to atd job

Christian_Voelker · June 26, 2020, 1:12pm

Problem

You notice a high CPU load on your UCS server.

Environment

When checking with top, you notice a number of processes named “atd”, most of them at the top of the list, consuming CPU.
The atd status shows:

root@ucs:~# /etc/init.d/atd status
● atd.service - Deferred execution scheduler
   Loaded: loaded (/lib/systemd/system/atd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2020-05-01 19:03:21 IST; 28min ago
     Docs: man:atd(8)
 Main PID: 485 (atd)
    Tasks: 125 (limit: 4915)
   Memory: 21.1M
      CPU: 2h 32min 54.950s
   CGroup: /system.slice/atd.service
           ├─  485 /usr/sbin/atd -f
           ├─19456 /usr/sbin/atd -f
           ├─19460 /usr/sbin/atd -f
           ├─19461 /usr/sbin/atd -f
[...]

May 01 19:31:59 ucs[12329]: Userid 3314 not found - aborting job        9 (a000090193b5e2)
May 01 19:31:59 ucs[12324]: Userid 3314 not found - aborting job       26 (a0001a0193dcf8)

A non-local user with shell access (uid 3314) has scheduled some jobs through the atd service.
If jobs exist when the server starts, atd tries to execute them.

During startup the LDAP service is started in parallel, so the user is not yet visible to atd and the jobs are aborted, as the log lines above show. This repeats continuously and keeps the CPU load high.
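
The affected user IDs can be read directly from log messages like the ones shown above. The following is a minimal sketch, assuming the messages keep the “Userid N not found” wording from the excerpt; the helper name missing_uids is made up for illustration.

```shell
# Hypothetical helper: print each uid that atd reported as "not found".
# Reads log lines on stdin, so it works with any log source.
missing_uids() {
    sed -n 's/.*Userid \([0-9][0-9]*\) not found.*/\1/p' | sort -u
}

# Possible usage on the affected server (run as root):
#   journalctl -u atd --no-pager | missing_uids
```

Each uid is printed once, even if many jobs were aborted for the same user.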

Solution

Step 1

Identify the user who created the jobs and ask them to delete all of their jobs.

root@ucs:~# getent passwd 3314
max.muster:x:3314:5001:Max Musterr:/home/max.muster:/bin/bash
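
If the user cannot be reached, root can list and remove the jobs instead, since atq run as root shows the jobs of all users. The following is a sketch, assuming atq prints the job number in the first column and the owner's user name in the last column, as the at package on Debian-based systems does; jobs_for_user is a hypothetical helper.

```shell
# Hypothetical helper: print the job numbers owned by one user.
# Expects atq-style lines on stdin: job number first, user name last.
jobs_for_user() {
    awk -v user="$1" '$NF == user { print $1 }'
}

# Possible usage (run as root), reviewing the list before deleting:
#   atq | jobs_for_user max.muster
#   atq | jobs_for_user max.muster | xargs -r atrm
```

Printing the job numbers first and piping them to atrm only in a second step makes it easier to check that no other user's jobs are removed.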

Step 2

Restart atd.service once startup has finished, so that the service re-reads the user IDs:
root@ucs:~# systemctl restart atd
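
The restart only helps once the user can actually be resolved. The following is a minimal sketch that waits until the uid is visible before restarting atd; wait_for_uid and its default of 30 attempts are assumptions for illustration and not part of UCS.

```shell
# Hypothetical helper: wait until a uid can be resolved via getent.
# $1 = uid to look up, $2 = maximum number of attempts (default 30).
# Returns 0 as soon as the uid resolves, 1 if all attempts fail.
wait_for_uid() {
    uid=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        getent passwd "$uid" >/dev/null && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Possible usage (run as root) after the server has booted:
#   wait_for_uid 3314 && systemctl restart atd
```

If the uid never resolves, atd is left untouched and the lookup problem can be investigated first.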