Mysterious crash of UCS 5

I had not touched my UCS 5 install for a few weeks. I logged in, started an update, and got some error, but I wasn’t paying attention. The VM became unresponsive. I tried to reboot and it failed, ending up in BusyBox.

Now I need to re-create everything from scratch, which is extremely distressing. However, I can cope. What I want is to figure out why it happened so I can keep it from happening again.

The virtual environment was XCP-ng with Xen Orchestra as the management console. The UCS VM had 6 GB RAM, two CPU cores, and a minimal configuration: half a dozen machines, DNS, and that’s about it. I saved the VM image in case something can be found by examining it.

UCS was installed by importing the VMware OVA image. Would I have been better off using the ISO for installing?

I just want to say a quick thanks for posting this. Many times when I go to update my UCS VM I almost reflexively just hit the update button and then think to myself, I need to snapshot first, but I never have a problem so I can just go ahead and update.

Most of the time I resist the urge and run a snapshot, but I admit I’ve probably skipped it at least a few times. This is the kind of post that will help me be more diligent in the future. I do of course have backups as well, but a snapshot is a lot more convenient and would have less data loss than restoring the nightly backups.

I’m glad you can recover from it. I don’t understand the Linux boot process well enough to offer advice on how to recover from BusyBox. I think I have managed that at least once in the past, but I’m sure it took a lot of Googling and reading to do it. Maybe someone else can chime in with some advice on that possibility. Personally, I always install from the ISO so I know things install according to what is seen on my particular VM setup. I’m not sure how much that matters, but I’ve never run into any issues with it, and it’s pretty much the same process for any distro I want to try out.

We both need to remember to do snapshots first. By recovering, I mean starting over from scratch. I’m going to contact support (since I forgot we were paying for it) and see if I can just copy data files over and not have to rebuild everything.

Of course this all happens when another customer of mine is going through their major Christmas rush. Yay Christmas!

Certainly on VMware ESXi,
you should always shut down the installation, THEN take a snapshot.

I have seen situations where “snapshots” say they are good but will not restore correctly.
Also, databases are better off closed when snapshotting.
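
For what it’s worth, the shut down, snapshot, restart order can be scripted so it is harder to skip. The following is only a rough sketch for an XCP-ng host, not a tested procedure: it assumes the `xe` CLI is available on the host, that the VM can be selected by name label with `vm=`, and the VM name "ucs5" and the helper names are made up for illustration. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
from datetime import datetime, timezone


def snapshot_commands(vm_name, when=None):
    """Build the ordered xe commands: shut down, snapshot, start again.

    vm_name is a hypothetical VM name label; adjust it to your setup.
    """
    when = when or datetime.now(timezone.utc)
    label = f"{vm_name}-pre-update-{when:%Y%m%d-%H%M%S}"
    return [
        # Clean shutdown first, so databases are closed before the snapshot.
        ["xe", "vm-shutdown", f"vm={vm_name}"],
        ["xe", "vm-snapshot", f"vm={vm_name}", f"new-name-label={label}"],
        ["xe", "vm-start", f"vm={vm_name}"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence if any step fails.
            subprocess.run(cmd, check=True)
```

Calling `run(snapshot_commands("ucs5"))` just prints the three commands; pass `dry_run=False` on the host itself to actually execute them, and only start the update after the snapshot step has succeeded.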

Well, I had another interesting few hours working this problem. The first challenge was that the OVA image no longer installed. When I used the manual command, it complained with:

ERROR content {"error":{"code":-32000,"message":"Invalid tar header. Maybe the tar is corrupted or it needs to be gunzipped?"},"id":0,"jsonrpc":"2.0"}

But I’ll file a separate ticket on that one.
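
In case it helps anyone seeing the same message: an OVA is normally a plain tar archive, so one quick local check is whether the downloaded file is really a tar or has been gzip-compressed. Here is a minimal sketch using only the Python standard library. The function name `inspect_ova` is my own, and this only tells you what the file looks like; it does not prove why the import failed.

```python
import tarfile

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def inspect_ova(path):
    """Return 'gzip-compressed', 'tar', or 'unrecognized' for the file at path."""
    with open(path, "rb") as fh:
        head = fh.read(2)
    # Check gzip first: tarfile.is_tarfile() also accepts compressed tars,
    # so it cannot tell the two cases apart on its own.
    if head == GZIP_MAGIC:
        return "gzip-compressed"
    if tarfile.is_tarfile(path):
        return "tar"
    return "unrecognized"
```

If this reports "gzip-compressed", decompressing the file before importing might be worth a try; "unrecognized" would point more toward a truncated or corrupted download.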

I installed from the ISO and rebooted multiple times during setup, and everything was great. Then, all of a sudden, it would reboot, display the startup screen, and about 30 seconds to a minute later drop back into BusyBox.

I filed an official support ticket, so hopefully we can find out what’s going on. I’ll update when I get more information.