Cannot start VM: some hints on how to debug in general

Yesterday I had some problems with my server at home, currently on UCS 4.2. They are solved now, and I just want to share my experience (and write it down in case I need it myself later).

First I noticed that “/” was mounted “ro”. It is hard to find the reason in the logs, especially when /var has no separate partition, because nothing can be written to the log files while the root filesystem is read-only.
After the first restart the system was stuck in a reboot loop due to a crash in a very early boot phase. I still don't know the reason. I just tried different boot options, and the system finally came up with the latest kernel in sysvinit mode. Subsequent reboots using the former default were successful.
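
To confirm a read-only root without relying on the logs, one can look at the mount options directly. This is only a minimal sketch: the function name root_is_readonly is made up for this example, and it assumes a Linux system where /proc/mounts lists the mount options in the fourth column.

```shell
# Hypothetical helper: exit status 0 if "/" is mounted read-only, 1 otherwise.
# Reads /proc/mounts by default; another file in the same format can be passed.
root_is_readonly() {
    mounts_file=${1:-/proc/mounts}
    # "/" can appear more than once (e.g. rootfs first); the last entry is
    # taken. Options are split on commas so that "errors=remount-ro" is not
    # mistaken for "ro".
    awk '$2 == "/" { opts = $4 }
         END {
             n = split(opts, o, ",")
             for (i = 1; i <= n; i++) if (o[i] == "ro") exit 0
             exit 1
         }' "$mounts_file"
}
```

Usage: root_is_readonly && echo "root filesystem is read-only". If it is, the kernel messages from dmesg are probably the better place to look, since they do not depend on a writable /var.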

The hardest issue came next. None of my virtual machines wanted to start. I just got an error saying “AttributeError: 'NoneType' object has no attribute 'lookupByUUIDString'”. Univention Bugzilla has some entries about this, for example https://forge.univention.org/bugzilla/show_bug.cgi?id=35354. There is no specific advice on what to do, but I got the idea that something was broken on the libvirt level. As a simple “virsh list” did not succeed either (it just hung without doing anything useful), I first tried method “Holzhammer” (sledgehammer): reinstall the virtualization components. Even though I removed the configuration files and rebooted after uninstalling/reinstalling, I got the same error again. As I had no idea where to look for more information (the usual logs just showed the same as the UMC, and I had the feeling that increasing the verbosity would flood me with unrelated stuff), I decided to try another way.
By running strace virsh list I was able to see that it was waiting for something while reading from the socket /var/run/libvirt/libvirt-sock, like this (example):
read(6, " \0\200\206\0\0\0\1\0\0\0B\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0", 32) = ...

A few lines above, the connection to this socket had been opened:

connect(6, {sa_family=AF_LOCAL, sun_path="/var/run/libvirt/libvirt-sock"}, 110) = 0
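
Finding that connect() line by eye can be tedious in a long trace. The following is a rough sketch of how the socket paths could be filtered out of strace output. The function name socket_paths_from_strace is made up for this example, and the pattern assumes the line format shown above (I believe newer strace versions print AF_UNIX instead of AF_LOCAL, so both are accepted).

```shell
# Hypothetical helper: read strace output on stdin and print the paths of
# all UNIX domain sockets that were passed to connect(), one per line.
socket_paths_from_strace() {
    sed -nE 's/.*connect\([0-9]+, \{sa_family=AF_(LOCAL|UNIX), sun_path="([^"]*)".*/\2/p' | sort -u
}

# Possible usage (strace writes to stderr, hence the redirect; timeout
# stops the trace in case virsh hangs, as it did in my case):
#   timeout 10 strace -e trace=connect virsh list 2>&1 | socket_paths_from_strace
```

Abstract sockets and connections to other address families are ignored by this pattern.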

Now check which program(s) are working with this socket:

# lsof /var/run/libvirt/libvirt-sock
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF  NODE NAME
libvirtd 4069 root   11u  unix 0xffff93cd3dc00400      0t0 43517 /var/run/libvirt/libvirt-sock

and check what it is currently doing (note: in my case I had restarted libvirtd beforehand):

strace -f -F -p 4069

This output stated something like “waiting to acquire lock from /var/lib/ebtables/lock” (I did not record the full message). The lock file existed, but its timestamp was a couple of hours old. As I had rebooted several times in between, I guessed that the file was just a leftover and deleted it.
Just to be safe I restarted libvirtd again, and that was it: my VMs could be started without further problems.
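
The "older than the last reboot" reasoning can also be checked mechanically before deleting anything. This is a minimal sketch under a few assumptions: the function name lock_older_than_boot is made up, the boot time is taken from the btime line in /proc/stat (Linux), and stat -c %Y is the GNU coreutils way to get a file's modification time as epoch seconds.

```shell
# Hypothetical helper: exit status 0 if the file was last modified before
# the system booted, 1 if it is newer, 2 if it cannot be examined.
# The boot time (epoch seconds) can be passed as second argument; by
# default it is read from /proc/stat.
lock_older_than_boot() {
    lock_file=$1
    boot_epoch=${2:-$(awk '/^btime/ { print $2 }' /proc/stat)}
    [ -e "$lock_file" ] || return 2
    lock_epoch=$(stat -c %Y "$lock_file") || return 2
    [ "$lock_epoch" -lt "$boot_epoch" ]
}

# Possible usage with the lock file from this post:
#   lock_older_than_boot /var/lib/ebtables/lock && echo "predates last boot"
```

A file that predates the last boot is only a hint that it is a leftover. Running lsof on the lock file to see whether any process still has it open seems like a sensible extra check before removing it.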

hth,
Dirk
