Install hangs during UCS configuration

cbservices · April 7, 2017, 5:15pm

I’ve been looking at server systems for a non-profit I do some work for, and I finally settled on UCS due to features and usability.
I’m trying to install UCS 4.2 on a pair of 3TB hard drives (RAID 1) in an HP Core2 Duo tower.
This is a clean install on brand new hard drives, the rest of the hardware, while used, works fine without any issues.

I’ve been through this install several times using different options, and the same thing happens every time (well, once I got past the BIOS boot partition requirements which caused grub-install to fail):
The base install works fine, all the partitions are set up properly, etc, until I get to the UCS configuration. After I enter in the domain and server names, it proceeds to install and configure packages, gets to about 8% on the progress bar, then goes back to 5% again. It keeps moving, though, until it gets to:

“Configuring libnss-ldap”.

Once it hits this package, it just sits. I’ve left it for an hour at least, and nothing else changes, although the system isn’t locked, as I can still move the mouse, and change virtual terminals with Alt-F1-F5. If I leave it long enough, or try to change virtual terminals after it’s sat for an extended period of time, it eventually seems to move to a screen that says the configuration was not completed, but it should restart after a reboot.

When I reboot it, it gives me the UCS splash screen, with what I assume are supposed to be boot messages underneath, but they all show up as a series of small squares, like the font it’s trying to use isn’t installed. Eventually, it hangs, and if I hit Escape to see the boot messages, there are a bunch of green “OK” and several red “FAILED” messages. I can’t remember what all the failed ones are, and it’s installing again right now, so I can’t check at the moment, but some have to do with LDAP and RPC, if I remember correctly.

However, it never actually gets to a point where I can log in, as it seems to hang on the last of the failed startup scripts, and never goes any further.

Any suggestions on where I should look next? I’m thinking of installing 4.1 instead, if I can’t figure this out. Maybe the upgrade to 4.2 will go smoother than a clean install.

cbservices · April 7, 2017, 6:05pm

Ok, so I’ve gone into recovery mode, manually mounted /usr (weird…/usr/local/ was mounted, but /usr wasn’t), and run apt-get install libnss-ldap, and it told me packages were not configured.
So, I ran dpkg --configure -a, and it’s doing a whole pile of stuff. Been running for 5 minutes already, configuring all the stuff that wasn’t done yet. I’m still noticing some errors about services it can’t connect to, but I’m hoping once all is done and I reboot into normal mode, it’ll work without those errors.

We shall see.

Grandjean · April 7, 2017, 6:08pm

Hello and welcome

Edit: I typed this before I saw your update. Maybe it still provides some value:

I guess the first check should be if the UCS 4.2 ISO got corrupted somehow:

me@host:~$ md5sum UCS_4.2-0-amd64.iso 
d88b190e8d33d41101f89e10b3e64c70  UCS_4.2-0-amd64.iso

The hash value must be the same as in the example above.
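
If you'd rather not compare the hash by eye, the check can be scripted. This is only a minimal sketch, assuming GNU md5sum and awk are available; verify_md5 is a made-up helper name, not something shipped with UCS:

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED_HASH
# Hypothetical helper: prints "match" and returns 0 if the MD5 of FILE
# equals EXPECTED_HASH, otherwise prints the hash it got and returns 1.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<hash>  <filename>"; keep only the hash field.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "match"
    else
        echo "MISMATCH: got $actual"
        return 1
    fi
}
```

You would call it as verify_md5 UCS_4.2-0-amd64.iso d88b190e8d33d41101f89e10b3e64c70, using whatever filename your download actually has.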

Did you copy the ISO to DVD or USB pendrive? It’s also possible that the data got corrupted in this step.

Does the system have access to the internet during installation? (That should not matter, but you never know …)

If you can switch to another virtual terminal via ALT+F1, then you should be able to activate the busybox shell, run chroot /target /bin/bash to start a bash shell inside the installation target and then have a look at /var/log/univention/setup.log - there you might find more information on what’s wrong.

Best regards,
Michael Grandjean

cbservices · April 7, 2017, 6:15pm

Michael,

Thanks for your reply. I burned the ISO to DVD. I’m checking the MD5 of the file right now, if it matches, I’ll check the DVD itself, too.
Incidentally, my filename is UCS-Installation-amd64.iso. It doesn’t have the version number in it.
When I looked at /var/log/univention/setup.log in recovery mode, it ended with the line of configuring libnss-ldap, too. No errors or anything beyond that.

Ok, so the MD5 just finished. The file matches, so I’ll check the DVD and post back when I have that info.

Also, after finishing the dpkg --configure -a and rebooting, no change. Still the same errors, still no way to log in.

cbservices · April 7, 2017, 6:37pm

The DVD checks out, so who knows what’s causing it.

I’m assuming there is a command that installs all the UCS-specific stuff on top of Debian. Is this something that’s installed and I can run manually, or would I have to get a list of packages to manually apt-get them all?

Oh, and to answer your other question, yes, it has Internet access during the install.
I’m thinking I’m going to try installing on a single drive in another machine I have kicking around doing nothing, and see if I have issues with it.

Grandjean · April 7, 2017, 7:08pm

To be honest, it’s not that easy. Using the ISO is the only supported / official way to install UCS (you can also use pre-installed virtual machine images, but that’s of course no option in your case).

Okay. I guess having a look at the following additional logs might be worth a try:

/var/log/dpkg.log
/var/log/apt/term.log
/var/log/apt/history.log
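
To look at the end of all of these in one go (plus the setup.log mentioned earlier), something like the following might help. It is a rough sketch, not an official tool: show_log_tails is a hypothetical name, and the first argument is assumed to be the root of the installed system, e.g. /target from the installer's busybox shell or / in recovery mode.

```shell
#!/bin/sh
# show_log_tails [ROOT] [LINES]
# Hypothetical helper: prints the last LINES lines (default 20) of the
# installation-related logs found below ROOT (default /).
show_log_tails() {
    root=${1:-/}
    lines=${2:-20}
    for log in var/log/univention/setup.log \
               var/log/dpkg.log \
               var/log/apt/term.log \
               var/log/apt/history.log; do
        # Strip one trailing slash from ROOT so "/" and "/target" both work.
        path="${root%/}/$log"
        echo "== $path =="
        if [ -r "$path" ]; then
            tail -n "$lines" "$path"
        else
            echo "(not readable)"
        fi
    done
}
```

For example, show_log_tails /target 40 would print the last 40 lines of each log inside the installation target.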

It might also be some bad blocks on the hard drives, but since they are brand new, that’s very unlikely, I guess.

cbservices · April 7, 2017, 8:14pm

Ok, so /var/log/apt/term.log ends with a prompt, where the libnss-ldap package is asking for the LDAP server URI. This doesn’t get passed on to the GUI, so it just sits waiting for input that will never come. Looks like an installer bug, but I’m amazed something like that wouldn’t get caught before release, so I’m thinking that’s not likely.
I’m thinking I might try a text mode install. What do you think?

cbservices · April 7, 2017, 9:06pm

Text mode installation was done, on the same partition/LVM layout as originally (8MB bios_boot, 512MB /boot, the rest LVM on each disk), with each partition formatted as new.

No errors or freezing during the install, until I rebooted.
Somehow, my /boot partition disappeared, so now I get grub rescue complaining about a missing partition. At this point, I’m glad it’s Friday and there’s wine in the fridge…

cbservices · April 7, 2017, 9:57pm

Second attempt at a text mode installation, and it boots up now with fewer errors, but still some.

Probably the most critical one is “Failed to start Login Service,” which of course, explains why I can’t log in.

I also get:
“Failed to start LSB: Set up cgroupfs mounts.”
“Failed to start LSB: Univention process supervision.”
“Failed to start System Logging Service”
“Failed to start Docker Application Container Engine.”

It doesn’t seem to proceed beyond this last error, as it sat there for several minutes before I stopped paying any attention to it at all.

Looks like it’s all systemd-related stuff. I’m definitely thinking a 4.1 install and upgrade to 4.2 at this point. Might even stay on 4.1 to avoid the systemd issues altogether.

Unless you have any other ideas…

cbservices · April 8, 2017, 4:27pm

Discovered something else. /usr fails to mount. Every single time.

At first, when I noticed that /usr/local was mounted, but /usr wasn’t (only contained the local dir for mounting /usr/local) I thought that /usr/local was mounting first, leaving /usr nowhere to go.

So, I reinstalled again, leaving /usr/local out, so I knew there would be no conflicts. Rebooted into recovery mode (didn’t even try normal) and no /usr. When I mount it manually with mount /usr it works fine, with no errors; it shows up as an entry in /etc/fstab, but it just doesn’t mount.

I’m going to try again with /usr on the / volume, and see what happens.

Grandjean · April 8, 2017, 5:20pm

Ah, okay. That will most probably be the root cause:

http://sdb.univention.de/1386

The systemd implementation doesn’t support a separate /usr
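
A quick way to see whether a system is set up like this is to look for a /usr entry in /etc/fstab. The following is just a sketch under that assumption: has_separate_usr is a hypothetical name, and it only inspects the fstab file, not what is actually mounted.

```shell
#!/bin/sh
# has_separate_usr [FSTAB]
# Hypothetical helper: returns 0 and prints the entry if FSTAB (default
# /etc/fstab) has a non-comment line whose mount point is exactly /usr.
# Entries such as /usr/local do not count.
has_separate_usr() {
    fstab=${1:-/etc/fstab}
    # Field 2 of an fstab line is the mount point; skip comment lines.
    awk '$1 !~ /^#/ && $2 == "/usr" { print; found = 1 }
         END { exit !found }' "$fstab"
}
```

Used as has_separate_usr && echo "separate /usr configured", it prints the offending fstab line followed by the message.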

cbservices · April 9, 2017, 4:11am

If that’s the case, then your installer shouldn’t give you a selectable option to mount a partition on /usr, which it does.
systemd is shit. I’ve heard a lot of people say that, even though some reasons seemed a little extreme… Now I have my own personal reasons, and I can agree with those people.

However, that’s not the end of my problems. The combined / and /usr worked for bootup, but when the text install was finished, it goes back into GUI mode to finish configuration, gets to 5% or so, then hangs.
/var/log/apt/term.log again shows that it’s waiting for a prompt to be answered, but this prompt is never passed on to the GUI. Should it be an automatic default, or should the GUI get a dialog that pops up?

I killed the role10 (or was it 10role) process in top on a virtual console, then it continued, but eventually came to other errors, including errors configuring packages in apt/dpkg. Unfortunately, I’d shut down Firefox, too, so I don’t know if it would have displayed anything useful for this.
Rebooting now gives me the proper font for startup messages, rather than the series of squares it did before, but now apache fails to start, because the configure script stores the SSL certificate files in /tmp. (Really? Who thought that was a good idea? I’m assuming it would move it after the installation is finished, but that means if anything goes wrong that requires a reboot before setup is completed, it can never be completed.)

I’m going to try a single install of 4.1, and an upgrade to 4.2 from that, but if anything goes wrong at this point, I’ve already wasted 2 entire days trying to get this thing installed, and I’m going to ditch it for something else entirely.

cbservices · April 9, 2017, 10:04pm

So, the 4.1 installer worked flawlessly, with one exception: the “Install additional software” step failed the first time, but worked fine when I re-ran it. I’m assuming my Internet dropped briefly, as it’s been a touch unstable lately.

I’m currently running the update to 4.2, after updating all 4.1 packages to the latest versions. Hopefully this works fine, and doesn’t cause any new issues. I’ll keep you posted, but I’d say based on this experience that your 4.2 installer needs a fair bit of work.

cbservices · April 10, 2017, 2:20pm

The 4.1 to 4.2 upgrade also worked flawlessly, and the system is now up and running properly, on the exact same hardware that was giving me such grief with a direct 4.2 install.

So, a couple of suggestions:

Remove the option to mount a filesystem at /usr when doing manual partitioning, since this isn’t supported, anyway.
Figure out why the package installation prompts are (sometimes?) waiting silently in the background, so the installation appears to hang.

Also, as an aside, the issue with the separate /usr that’s not supported by systemd, doesn’t make sense to me. The systemd documentation states that it does, in fact, work fine with a separate /usr partition, so why does your implementation fail so completely when it’s set up this way?
Just curious, more than anything.

Grandjean · April 11, 2017, 6:33pm

Hello @cbservices,

I’m glad to hear that.

Thanks for your feedback (and for your patience with the UCS 4.2 installation).
I added this to our issue tracker, because I think you are right.
If you are interested, here is the original issue regarding a separate /usr. We already removed /usr from the auto-partition schemes and we detect and block the upgrade from UCS 4.1 if /usr is split from the root partition. Of course it’s always arguable where to draw the line between restricting choices and giving people the possibility to shoot themselves in the foot.

I tried to reproduce this, even with a separate /usr, but had no prompts waiting in the dpkg or apt logs. Be assured that we do a lot of installation tests with all kinds of setups and, as far as I know, the installation prompts didn’t occur then. Can you provide the relevant log files (if you still have them)? That would help a lot. The exact partitioning scheme would help too, so we can come as close to your setup as possible.

I think we can agree to disagree. IMHO there are good reasons to replace the aging sysvinit with a modern init system that has its advantages. And yes, systemd does a lot of things differently and that can be annoying - but that’s nothing bad per se. And to be honest, UCS is one of the last Linux distributions to actually make the switch.
I think there are valid points one can criticize about systemd, just as with every piece of software. The main problem I have with most anti-systemd arguments is that they often boil down to ad hominem attacks against the core developers.

Yeah, it’s a bit complicated. I guess you are referring to Booting Without /usr is Broken written by the systemd people. There they state:

systemd itself is actually completely fine with /usr on a separate file system that is not pre-mounted at boot time. However, the common basic set of OS components of modern Linux machines is not

The important part is: systemd itself is fine, but a lot of Linux components just don’t work if /usr is not mounted before the main boot process. That’s a pity, grown historically, but the current state. Because of this, systemd just doesn’t mount /usr itself, as the article explains:

we now just expect /usr to be pre-mounted from inside the initramfs, to be available before ‘init’ starts

So the solution would be to mount /usr before systemd kicks in, in the so-called early user space stage or initramfs. Unfortunately, Debian’s initramfs-tools can’t do this currently. There has been some work on this, but as far as I can see it is not finished. It would also be possible to run systemd already in the initramfs, but that is not implemented either. Another possibility is to use a different initramfs generator, e.g. dracut, which seems to support mounting /usr in early user space - however, that would require also replacing the initramfs during the upgrade from UCS 4.1 to 4.2, and we would move away from Debian defaults.
As you can see, there are different possibilities, some with unknown consequences. UCS has chosen the one that seems to be the least risky for the majority of users.

Best regards,
Michael Grandjean

cbservices · April 11, 2017, 8:43pm

And I see someone already responded with “go hang yourself with your own rope.”
I’m not registered on the bugtracker, but if you want to respond with:
“It’s not that you can hang yourself, it’s that the installer actively encourages you to do so.
It asks if you want to mount a partition on /usr, and then later says ‘HA! Fooled you! Now your system is borked!’”

I think that will get the point across.

I don’t have the logs, but I do know roughly what the partition scheme was. I had a separate partition for everything, as I like to keep everything separate for troubleshooting and repair purposes later on.

Primary partitions:
bios_boot - 8MB
/boot - 512MB
LVM - rest of 3TB drive

LVM volumes:

/ - 512MB
/home - 40GB
/opt - 100MB
/srv - 1TB
/tmp - 1GB
/usr - 1.4GB
/usr/local - 35MB
/var - 1.5GB
swap1 - 1GB
swap2 - 1GB

If you want to make an init system, then make an init system. I have no problem with that. It’s the taking over of everything else, and absorbing it into systemd, that I have a problem with. Last I heard (which was a while ago, I admit), he was talking about taking over su functionality, because su was broken. Making a swap-out init system is one thing, but trying to replace everything but the kernel because “it’s all broken” is something else entirely, and it’s well on its way to being the second.

Grandjean:

Yeah, it’s a bit complicated. I guess you are referring to Booting Without /usr is Broken written by the systemd people. There they state:

systemd itself is actually completely fine with /usr on a separate file system that is not pre-mounted at boot time. However, the common basic set of OS components of modern Linux machines is not

The important part is: systemd itself is fine, but a lot of Linux components just don’t work if /usr is not mounted before the main boot process. That’s a pity, grown historically, but the current state. Because of this, systemd just doesn’t mount /usr itself, as the article explains:

we now just expect /usr to be pre-mounted from inside the initramfs, to be available before ‘init’ starts

Ok, I’ll give you that. But to me, creating a system that reads /etc/fstab and explicitly bypasses any entry for /usr isn’t good. That doesn’t just make the odd thing potentially fail; it makes the entire system fail to boot. I’ll give you that it makes it blatantly obvious that something’s wrong, but it also means that extra code had to be written into systemd to check for a /usr mount and ignore it, which is more potential for bugs. It also means that upgrade scripts then have to check for a separate /usr before upgrading, or the system will completely die after the upgrade, which introduces a whole lot more potentially buggy code, individually developed by every different distribution. Not a good situation for the entire Linux ecosystem, if you ask me.
I realize why you did it the way you did, given what you had to work with in systemd, but I don’t think it was completely thought out by Lennart and crew before they made that decision. Much better would be a warning on startup that paused for 30 seconds or so and filled the screen with “A separate /usr partition IS NOT RECOMMENDED. It’s a good idea to merge this into /.”

Anyway, this is all just trying to psychoanalyze a guy I’ve never even met, and has nothing to do with my installation issues anymore, so it’s a bit pointless.

Back to the installation issues: Give that partition scheme a try, and see if you can get a failure. If you can’t, I have a similar SFF machine that has nearly identical hardware to the minitower I was installing on previously, other than the hard drives. I can try on it if you need me to, and see if it fails in the same way.

TimoDenissen · April 12, 2017, 10:03am

@cbservices I edited your post and removed the insults, with which you violated our community guidelines by name-calling. See our guidelines for details: https://help.univention.com/faq#agreeable.

This is a warning, not a suspension. Please refrain from calling other people and users names on help.univention.com.