Network installation hangs

Hi

We are currently looking into installing UCS 5.0-1 unattended.
We boot the opsi-linux-bootimage to patch some files for the installation and then integrate them into the initrd.
This worked fine up to UCS 4.4.
However, we now have an issue with the nameserver in the target system inside the installer.
The installer itself runs fine and prepares everything it needs for the installation.
Once it changes root to /target, the host updates.software-univention.de can no longer be resolved.
/etc/resolv.conf contains only this one uncommented line; the rest of the file is commented out:

options timeout:2

EDIT: I checked the resolv.conf in /target. It was fine until some univention package was installed and overwrote the existing resolv.conf with the univention one, which has no nameserver.
EDIT END
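To confirm this symptom from a shell in the installer, one can test whether the target's resolv.conf still has an active nameserver entry. This is only a minimal sketch; the function name `has_nameserver` is made up for illustration.

```shell
# has_nameserver FILE
# Succeeds if FILE contains at least one active (uncommented) "nameserver" line.
has_nameserver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "$1"
}

# Example use from the installer shell (path as described above):
#   has_nameserver /target/etc/resolv.conf || echo "no nameserver configured"
```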

Checking with ucr shows that nameserver1 is not set.
When I set this manually with

ucr set nameserver1=IP

the installation will continue.
But this should also work with these preseed settings.
This is the uss part of the preseed:

#
# Univention System Setup profile
#

univention-system-setup-boot uss/start/join string true

univention-system-setup-boot uss/hostname string #@hostname*#
univention-system-setup-boot uss/domainname string #@dns_domain*#
univention-system-setup-boot uss/server/role string domaincontroller_master
univention-system-setup-boot uss/ldap/base string #@ldap_base*#
univention-system-setup-boot uss/locale string #@locale*#.UTF-8:UTF-8 en_US.UTF-8:UTF-8
univention-system-setup-boot uss/locale/default string #@locale*#.UTF-8:UTF-8
univention-system-setup-boot uss/organization string #@organisation*#
univention-system-setup-boot uss/windows/domain string #@windomain*#

univention-system-setup-boot uss/interfaces/primary string #@ifdevice*#
univention-system-setup-boot uss/interfaces/#@ifdevice*#/start string true
univention-system-setup-boot uss/interfaces/#@ifdevice*#/type string #@if_setup_type*#
univention-system-setup-boot uss/interfaces/#@ifdevice*#/address string #@ipaddress*#
univention-system-setup-boot uss/interfaces/#@ifdevice*#/netmask string #@netmask*#
univention-system-setup-boot uss/interfaces/#@ifdevice*#/network string #@networkaddress*#
univention-system-setup-boot uss/interfaces/#@ifdevice*#/broadcast string #@broadcastaddress*#
univention-system-setup-boot uss/interfaces/#@ifdevice*#/ipv6/acceptRA string false
univention-system-setup-boot uss/gateway string #@defaultgateway*#
univention-system-setup-boot uss/nameserver1 string #@uss_nameserver1*#
univention-system-setup-boot uss/nameserver2 string #@uss_nameserver2*#

univention-system-setup-boot uss/timezone string #@timezone*#
univention-system-setup-boot uss/xorg/keyboard/options/XkbLayout string #@language*#

univention-system-setup-boot uss/ssl/organizationalunit string Univention Corporate Server
univention-system-setup-boot uss/ssl/organization string #@organisation*#
univention-system-setup-boot uss/ssl/email string ssl@#@dns_domain*#
univention-system-setup-boot uss/ssl/state string #@country*#
univention-system-setup-boot uss/ssl/locality string #@country*#


univention-system-setup-boot uss/packages_install string #@uss_packages*#
univention-system-setup-boot uss/packages_remove string 
univention-system-setup-boot uss/components string #@uss_components*#
univention-system-setup-boot uss/ad/member string False

univention-system-setup-boot uss/root_password string #@rootpassclear*#
univention-system-setup-boot uss/update/system/after/setup string True

The #@PLACEHOLDER*# values are replaced correctly in the opsi-linux-bootimage.

Regards
Mathias

Yes, this behavior of overwriting `/etc/resolv.conf` with the UCR template during the installation, while the system is not yet fully configured, is a pain.

The bad news: We no longer support the PXE installation out-of-the-box since UCS 5.0-0. We forgot to mention this in the 5.0-0 release notes - we only communicated this internally so far.

So you’re on your own, but the code to copy the DNS settings from the Debian Installer into the corresponding UCR variables nameserver[123] is still there in netcfg. However, we changed a lot of the code for handling DHCP, which might have broken PXE.

I would first check /var/log/univention/config-registry.replog if the UCRVs are correctly set by netcfg and if some later process un-sets them.
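To follow that suggestion, the log can be filtered for the nameserver variables. This is a minimal sketch, assuming every change is logged on a single line that contains the variable name; the exact log format is an assumption on my side, and `replog_nameserver_changes` is a hypothetical helper name.

```shell
# replog_nameserver_changes FILE
# Prints every line of FILE that mentions nameserver1, nameserver2 or nameserver3,
# in file order, so set and unset operations can be read as a timeline.
replog_nameserver_changes() {
    grep -E 'nameserver[123]' "$1"
}

# Example use inside the target system:
#   replog_nameserver_changes /var/log/univention/config-registry.replog
```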

PS: I don’t know if PXE is your only option to deploy your images, but you could use debootstrap on any Debian based OS to bootstrap UCS. I already looked at virt-install once but no longer remember its state.

Good morning Philipp.

The netcfg settings are in the preseed:

d-i netcfg/choose_interface select auto
d-i netcfg/disable_autoconfig boolean true   
d-i netcfg/get_ipaddress string 192.168.16.201   
d-i netcfg/get_netmask string 255.255.0.0
d-i netcfg/get_gateway string 192.168.1.245   
d-i netcfg/get_nameservers string 192.168.1.244   
d-i netcfg/confirm_static boolean true

However, in the installed system I can’t find any netcfg setting. So this is indeed broken.

We can also do a debootstrap in the bootimage we use. We can install Ubuntu and Debian this way. I will see how this might work out.

As far as I can see, there is no dedicated debootstrap script for UCS.
Do you know the command to properly debootstrap UCS 5.0-1?

Regards
Mathias

PXE install

I checked it myself and my /var/log/syslog shows the following error:

base-installer: warning: /usr/lib/post-base-installer.d/55netcfg-ucr returned error code 127

That script is missing the hash-bang line #!/bin/sh. I’m not aware of any change in that area, but maybe some part of the Debian installer (d-i) no longer falls back to executing it as a shell script since UCS 5.0-0. Fixing this would require patching netcfg and building a new ISO, as netcfg is part of the debian-installer image and cannot be updated online.
(I vaguely remember that we observed some bug during UCS 5.0-0 development where the network-related settings were not copied from the D-I environment to the UCS chroot environment, but I cannot find that issue.)

I have created Bug #54259 for this.
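To test the missing hash-bang hypothesis, the hook directory can be scanned for scripts whose first line does not start with `#!`. This is a rough sketch only; the helper names `has_shebang` and `hooks_without_shebang` are made up for illustration.

```shell
# has_shebang FILE
# Succeeds if the first line of FILE starts with "#!".
has_shebang() {
    first_line=''
    # read fails on a final line without newline but still fills the variable
    IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
    case $first_line in
        '#!'*) return 0 ;;
        *) return 1 ;;
    esac
}

# hooks_without_shebang DIR
# Prints every regular file in DIR that lacks a hash-bang line.
hooks_without_shebang() {
    for hook in "$1"/*; do
        [ -f "$hook" ] || continue
        has_shebang "$hook" || printf '%s\n' "$hook"
    done
}

# Example use from the installer shell:
#   hooks_without_shebang /usr/lib/post-base-installer.d
```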

Workaround

There’s a second script, 56ucr, which can be used to set arbitrary UCR variables. Duplicate all network-relevant USS settings in the UCS installation profile with the prefix uss/ replaced by ucr/, e.g.

d-i ucr/interfaces/primary string #@ifdevice*#
d-i ucr/interfaces/#@ifdevice*#/start string true
d-i ucr/interfaces/#@ifdevice*#/type string #@if_setup_type*#
d-i ucr/interfaces/#@ifdevice*#/address string #@ipaddress*#
d-i ucr/interfaces/#@ifdevice*#/netmask string #@netmask*#
d-i ucr/interfaces/#@ifdevice*#/network string #@networkaddress*#
d-i ucr/interfaces/#@ifdevice*#/broadcast string #@broadcastaddress*#
d-i ucr/interfaces/#@ifdevice*#/ipv6/acceptRA string false
d-i ucr/gateway string #@defaultgateway*#
d-i ucr/nameserver1 string #@uss_nameserver1*#
d-i ucr/nameserver2 string #@uss_nameserver2*#
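The duplication can also be generated instead of being typed by hand. The following is a sketch under the assumption that the USS lines look exactly like the ones in the profile above; `uss_to_ucr` is a hypothetical helper name, and only the interfaces/, gateway and nameserver[123] entries are converted.

```shell
# uss_to_ucr
# Reads a preseed file on stdin and prints a "d-i ucr/..." line for every
# network-related "univention-system-setup-boot uss/..." line.
uss_to_ucr() {
    sed -nE 's#^univention-system-setup-boot[[:space:]]+uss/((interfaces/|gateway[[:space:]]|nameserver[123][[:space:]]).*)$#d-i ucr/\1#p'
}

# Example use (file name taken from later in this thread):
#   uss_to_ucr < ucs50.cfg
```

The original uss/ lines stay in the profile; the generated ucr/ lines are added next to them.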

debootstrap alternative

For debootstrap, use the script for Debian’s sid, basically this:

suite='ucs501'
target="$(mktemp -d)"
mirror='http://updates.software-univention.de'
script='/usr/share/debootstrap/scripts/sid'

debootstrap --no-check-gpg --no-check-certificate \
  --no-merged-usr \
  --include='univention-archive-key' \
  "$suite" "$target" "$mirror" "$script"

Before installing univention-base-files you have to configure the network manually, e.g. by setting UCRV interfaces/$iface/{start,type,address,netmask} and nameserver{1,2,3} and gateway.
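The list of variables can be built with a small helper. This is only a sketch: `ucr_network_settings` is a hypothetical name, the type value `static` is an assumption for a fixed address, and the function merely prints KEY=VALUE pairs without calling ucr itself.

```shell
# ucr_network_settings IFACE ADDRESS NETMASK GATEWAY NAMESERVER...
# Prints one KEY=VALUE pair per line for a static configuration.
# At most three nameservers are used (nameserver1..nameserver3).
ucr_network_settings() {
    iface=$1 address=$2 netmask=$3 gateway=$4
    shift 4
    printf 'interfaces/%s/start=true\n' "$iface"
    printf 'interfaces/%s/type=static\n' "$iface"   # assumption: static setup
    printf 'interfaces/%s/address=%s\n' "$iface" "$address"
    printf 'interfaces/%s/netmask=%s\n' "$iface" "$netmask"
    printf 'gateway=%s\n' "$gateway"
    n=1
    for ns in "$@"; do
        [ "$n" -le 3 ] || break
        printf 'nameserver%d=%s\n' "$n" "$ns"
        n=$((n + 1))
    done
}

# Possible use, assuming ucr is already available inside the chroot:
#   chroot "$target" ucr set $(ucr_network_settings eth0 192.168.16.201 255.255.0.0 192.168.1.245 192.168.1.244)
```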

Afterwards install these packages:

  • univention-system-setup-boot
  • univention-management-console-web-server
  • univention-management-console-module-setup
  • linux-image-amd64
  • openssh-server
  • univention-base-packages
  • grub-efi-amd64 | grub-pc

Disclaimer: I have not tested this explicitly but copied it from some of my internal scripts I use to prepare our internal docker images and such.
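For the "grub-efi-amd64 | grub-pc" choice in the package list, one option is to decide based on how the machine was booted. This is an untested sketch: `ucs_base_packages` is a hypothetical helper, and it relies on the directory /sys/firmware/efi existing only on systems booted via UEFI.

```shell
# ucs_base_packages [EFI_DIR]
# Prints the package list from above, one per line, and picks the GRUB
# variant depending on whether EFI_DIR (default /sys/firmware/efi) exists.
ucs_base_packages() {
    efi_dir=${1:-/sys/firmware/efi}
    printf '%s\n' \
        univention-system-setup-boot \
        univention-management-console-web-server \
        univention-management-console-module-setup \
        linux-image-amd64 \
        openssh-server \
        univention-base-packages
    if [ -d "$efi_dir" ]; then
        printf '%s\n' grub-efi-amd64
    else
        printf '%s\n' grub-pc
    fi
}

# Possible use inside the bootstrapped chroot:
#   apt-get install $(ucs_base_packages)
```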

This workaround doesn’t work. The installation still fails at the same spot with a broken resolv.conf.
From the syslog:

base-installer: warning: /usr/lib/post-base-installer.d/56ucr returned error code 127
##################### UCS ###########################
#
# Disable starting "Univention System Setup Boot" 
#
d-i ucr/system/setup/boot/start string false
#d-i ucr/system/setup/boot/start string true

d-i ucr/interfaces/primary string #@ifdevice*#
d-i ucr/interfaces/#@ifdevice*#/start string true
d-i ucr/interfaces/#@ifdevice*#/type string #@if_setup_type*#
d-i ucr/interfaces/#@ifdevice*#/address string #@ipaddress*#
d-i ucr/interfaces/#@ifdevice*#/netmask string #@netmask*#
d-i ucr/interfaces/#@ifdevice*#/network string #@networkaddress*#
d-i ucr/interfaces/#@ifdevice*#/broadcast string #@broadcastaddress*#
d-i ucr/interfaces/#@ifdevice*#/ipv6/acceptRA string false
d-i ucr/gateway string #@defaultgateway*#
d-i ucr/nameserver1 string #@uss_nameserver1*#
d-i ucr/nameserver2 string #@uss_nameserver2*#


#
# Univention System Setup profile

I also have the same error in my syslog.

However, when the error occurs and I start the script
/usr/lib/post-base-installer.d/55netcfg-ucr by hand, the resolv.conf is then written properly and name resolution works.

To me it seems that it tries to call /usr/bin/ucr before it is installed and therefore returns “command not found”.
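Exit status 127 is what a POSIX shell reports when a command cannot be found, which would fit this guess. A small sketch to check which commands a hook needs but cannot find; `missing_commands` is a made-up helper name and has to be run in the same environment in which the hook runs.

```shell
# missing_commands CMD...
# Prints every CMD that cannot be found in the current environment.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example use:
#   missing_commands ucr
# For comparison, a shell reports 127 for an unknown command:
#   sh -c 'no-such-command'; echo $?
```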

Regards
Mathias

See Bug #54259 comment 5 for the full analysis: With UCS 5 we changed some package priorities and had to update the way univention-config and univention-base-packages are installed. We only updated our internal build script for the ISO images, but forgot to update the profiles for network installation.
The following packages have to be added to the profile:

  • base-installer/includes += univention-config
  • pkgsel/include += univention-base-packages
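The "+=" means the package is appended to whatever the profile already lists for that key. A rough sketch of doing this with a script, assuming the profile uses lines of the form "d-i KEY string VALUE..."; `add_preseed_pkg` is a hypothetical helper name.

```shell
# add_preseed_pkg KEY PKG
# Reads a preseed file on stdin and writes it to stdout with PKG appended to
# the "d-i KEY string ..." line. If no such line exists, one is added at the end.
add_preseed_pkg() {
    awk -v key="$1" -v pkg="$2" '
        $1 == "d-i" && $2 == key && $3 == "string" {
            found = 1
            present = 0
            for (i = 4; i <= NF; i++)
                if ($i == pkg) present = 1
            if (!present) $0 = $0 " " pkg   # append only if not listed yet
        }
        { print }
        END { if (!found) print "d-i " key " string " pkg }
    '
}

# Example use:
#   add_preseed_pkg base-installer/includes univention-config < old.cfg |
#       add_preseed_pkg pkgsel/include univention-base-packages > new.cfg
```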

I’ve updated the extended installation guide to include the updated profile.

I added these packages to the preseed and the installation now continues.

However, something is different with the grub package.
I am asked what to do with /etc/default/grub, as it seems to differ from the installed version.

Strange: univention-grub provides a template for /etc/default/grub but does not depend on grub-pc or grub-efi. It leaves the job of figuring out the correct variant to grub-install, which is executed by debian-installer after pkgsel. At that later point there seems to be a conflict over that file. /var/lib/dpkg/info/grub-pc.postinst contains some non-trivial code to handle such cases, which is hard to debug this way. It might also help if you can provide the output from the option “Unterschiede zwischen den Versionen anzeigen” (show the differences between the versions).

Please check your profile for any grub related settings. My profile only has this:

$ grep grub seed/preseed.cfg 
d-i grub-installer/only_debian boolean true
d-i grub-installer/bootdev string default
grub-pc grub-pc/install_devices multiselect
grub-pc grub-pc/install_devices_empty boolean true

I guess you’re doing a fresh install and you install in an empty image: a GRUB from a previous installation might confuse the installer at this stage (due to changed settings).

I have exactly the same settings in the preseed:

cat source/special/unattend/ucs50.cfg | grep grub
d-i grub-installer/only_debian boolean true
d-i grub-installer/bootdev string #@grub_disk*#
grub-pc grub-pc/install_devices multiselect
grub-pc grub-pc/install_devices_empty boolean true

The installation currently happens on the same machine.
The machine is booted into a minimal Linux boot image and the disk is partitioned for the patching we need (such as injecting a patched preseed into the UCS 5.0 initrd). The installer is then booted via kexec.
This is the same procedure as with Debian, Ubuntu etc.

The installation syslog shows me this:

Dec 20 11:28:04 debconf: --> DATA debconf-apt-progress/info description Konfiguration von grub-pc (amd64) wird vorbereitet.
Dec 20 11:28:04 debconf: <-- 0 OK
Dec 20 11:28:04 debconf: --> PROGRESS INFO debconf-apt-progress/info
Dec 20 11:28:04 debconf: <-- 0 OK
Dec 20 11:28:04 debconf: --> DATA debconf-apt-progress/info type text
Dec 20 11:28:04 debconf: <-- 0 OK
Dec 20 11:28:04 debconf: --> DATA debconf-apt-progress/info description grub-pc (amd64) wird konfiguriert.
Dec 20 11:28:04 debconf: <-- 0 OK
Dec 20 11:28:04 debconf: --> PROGRESS INFO debconf-apt-progress/info
Dec 20 11:28:04 debconf: <-- 0 OK
Dec 20 11:28:04 debconf: --> TITLE Konfiguriere grub-pc
Dec 20 11:28:04 debconf: <-- 0
Dec 20 11:28:05 in-target: Use of uninitialized value $template in exists at /usr/share/perl5/Debconf/Template.pm line 86, <GEN2> line 69.
Dec 20 11:28:05 in-target: Use of uninitialized value $item in exists at /usr/share/perl5/Debconf/DbDriver/Cache.pm line 40, <GEN2> line 69.
Dec 20 11:28:05 in-target: Use of uninitialized value $item in exists at /usr/share/perl5/Debconf/DbDriver/Cache.pm line 40, <GEN2> line 69.
Dec 20 11:28:05 debconf: --> DATA ucf/changeprompt type select 
Dec 20 11:28:05 debconf: <-- 0 OK
Dec 20 11:28:05 debconf: --> DATA ucf/changeprompt description Wie wollen Sie mit der geänderten Konfigurationsdatei grub verfahren?
Dec 20 11:28:05 debconf: <-- 0 OK
Dec 20 11:28:05 debconf: --> DATA ucf/changeprompt extended_description grub: Eine neue Version (/tmp/grub.7hJvLMAcWU) der Konfigurationsdatei /etc/default/grub ist verfügbar, aber die derzeit installierte Version wurde verändert.
Dec 20 11:28:05 debconf: <-- 0 OK
Dec 20 11:28:05 debconf: --> DATA ucf/changeprompt choices Version des Paketbetreuers installieren, aktuell lokal installierte Version beibehalten, Unterschiede zwischen den Versionen anzeigen, Unterschiede zwischen den Versionen nebeneinander anzeigen, die Angelegenheit in einer neu gestarteten Shell untersuchen
Dec 20 11:28:05 debconf: <-- 0 OK
Dec 20 11:28:05 debconf: --> SUBST ucf/changeprompt BASENAME grub
Dec 20 11:28:05 debconf: Adding [BASENAME] -> [grub] 
Dec 20 11:28:05 debconf: <-- 0
Dec 20 11:28:05 debconf: --> SUBST ucf/changeprompt FILE /etc/default/grub
Dec 20 11:28:05 debconf: Adding [FILE] -> [/etc/default/grub]
Dec 20 11:28:05 debconf: <-- 0
Dec 20 11:28:05 debconf: --> SUBST ucf/changeprompt NEW /tmp/grub.7hJvLMAcWU
Dec 20 11:28:05 debconf: Adding [NEW] -> [/tmp/grub.7hJvLMAcWU]
Dec 20 11:28:05 debconf: <-- 0
Dec 20 11:28:05 debconf: --> INPUT critical ucf/changeprompt
Dec 20 11:28:05 debconf: <-- 0 question will be asked
Dec 20 11:28:05 debconf: --> GO 
Dec 20 11:36:34 debconf: <-- 0 ok
Dec 20 11:36:34 debconf: --> GET ucf/changeprompt
Dec 20 11:36:34 debconf: <-- 0 Version des Paketbetreuers installieren
Dec 20 11:36:34 debconf: --> DATA debconf-apt-progress/info type text
Dec 20 11:36:34 debconf: <-- 0 OK
Dec 20 11:36:34 debconf: --> DATA debconf-apt-progress/info description grub-pc (amd64) installiert

Thank you for the syslog excerpt, but it does not help. The problem is that /var/lib/dpkg/info/grub-pc.postinst contains some logic to handle the file /etc/default/grub using ucf (Update Configuration File). In contrast to dpkg’s internal mechanism for handling configuration files, ucf supports a 3-way merge: in addition to the “real” configuration file /etc/default/grub it uses the packaged file /usr/share/grub/default/grub and a copy of the previous version /var/lib/ucf/cache/:etc:default:grub to preserve manual changes even when the upstream file is updated. The maintainer script grub-pc.postinst uses yet another file, /var/lib/grub/ucf/grub.previous, to keep track of some changes and allow automatic merging.
The main problem with UCS is that the UCR template file /etc/univention/templates/files/etc/default/grub from univention-grub is yet another copy (5th?) of that file, which has not been updated for a long time.
The installation of GRUB by the Debian Installer is very late and only happens after base-installation has already installed univention-base-files and univention-grub, so UCR already provides /etc/default/grub, which is then sourced by grub-pc.postinst for ucf. It detects (from its point of view) some mis-configuration and shows the interactive dialog to resolve this issue.

In that dialog please select the option

Unterschiede zwischen den Versionen anzeigen

which should show you the changes ucf would merge into the templated /etc/default/grub. From that output I hope to see which setting is so important to grub-pc.postinst that it asks for interactive confirmation.

As an alternative, you can switch to a different console (Alt-F2) while ucf shows the dialog, start a shell there and get into the chroot environment via chroot /target /bin/bash (you will probably have the English keyboard layout). Within that console you can even start ssh via /etc/init.d/ssh start and then connect to the VM, which simplifies getting the following files so you can post them here or send them to us via e-mail:

  • /etc/default/grub
  • /etc/default/grub.ucf-*
  • /usr/share/grub/default/grub
  • /var/lib/grub/ucf/grub.previous
  • /var/lib/ucf/cache/:etc:default:grub
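Collecting these files can be scripted. The following is a sketch only: `collect_grub_files` is a hypothetical helper, it assumes a tar with gzip support is available in the environment, and the output path should be absolute because the function changes into the given root directory.

```shell
# collect_grub_files ROOT OUT
# Packs those of the files listed above that exist below ROOT into the
# gzip-compressed tar archive OUT. Runs in a subshell, so the caller's
# working directory and positional parameters are left untouched.
collect_grub_files() (
    root=$1
    out=$2
    cd "$root" || exit 1
    set --
    for f in etc/default/grub etc/default/grub.ucf-* \
             usr/share/grub/default/grub \
             var/lib/grub/ucf/grub.previous \
             var/lib/ucf/cache/:etc:default:grub; do
        if [ -e "$f" ]; then set -- "$@" "$f"; fi
    done
    if [ "$#" -eq 0 ]; then
        echo "none of the expected files found below $root" >&2
        exit 1
    fi
    tar -czf "$out" "$@"
)

# Example use from the second console, outside the chroot:
#   collect_grub_files /target /tmp/grub-files.tar.gz
```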

If you want to debug this yourself, you can try to add some debug code to /var/lib/dpkg/info/grub-pc.postinst and run it yourself. The main problem with that file is that it uses debconf, which redirects STDIN/OUT/ERROR for its own purposes, so you cannot simply use set -x. Instead, change (for example using nano) the first two lines of that file to match this:

#!/bin/bash
PS4='+${BASH_SOURCE}:${LINENO}:${FUNCNAME[0]}@${SECONDS}: '
exec 66>/tmp/log
BASH_XTRACEFD=66
set -x -e

and then execute the file using DPKG_MAINTSCRIPT_PACKAGE=grub-pc DPKG_MAINTSCRIPT_NAME=postinst /var/lib/dpkg/info/grub-pc.postinst configure "". The generated file /tmp/log should contain enough information to follow the logic of the script and to see why it behaves differently for you than for me.

What virtualization are you using? Qemu with virtio-blk? In the past there were similar issues with GRUB when it detected an installation device different from what is currently configured, e.g. UCR has /dev/vd (VirtIO) but the VM has /dev/sda (SATA, SCSI) or /dev/hda (legacy IDE). Please check your preseed profile for such differences.

Thanks for your detailed answer.

I did select the entry and ran debconf to generate a preseed and found the option

d-i     ucf/changeprompt        select  Version des Paketbetreuers installieren

Adding this to the profile solved the problem with grub. I will look into translating this and patching it correctly in the bootimage, just in case someone wants to install UCS 5.0 via netboot in another language.

Thanks for your support

Answering every ucf/changeprompt question with the same answer just silences the messenger, but does not fix the underlying problem. For now this might be okay and work for you, but it is not a general solution which others should copy without further thought.

Hi,
I know doing this is a very bad idea. However, for now it solves the issue.

Currently we are testing opsi 4.2 with UCS 5.0. Installing opsi on the DC Master works.
Whenever I install a backup or another role things get tricky.
The other roles won’t join the DC Master. They claim:

* Message:  Please visit https://help.univention.com/t/8842 for common problems during the join and how to fix them -- The ssh-login to Administrator@master239-50.ucs.backup-opsi.experimental42 failed with " ". Please make sure the account Administrator exists and is a member of the Domain Admins group!

I can switch to the Administrator user on the master when I am connected via SSH. However, I cannot log in with the Administrator user and the specified password.

Apart from the changes in the preseed you suggested the installation is basically the same as on ucs44, but it seems the Administrator user is not usable via the Web Interface.

root@master239-50:~# univention-check-join-status 
Warning: 'univention-portal' is not configured.
Warning: 'univention-server-overview' is not configured.
Error: Not all install files configured: 2 missing

From the join.log:

Object exists: cn=ldapacl,cn=univention,dc=ucs,dc=backup-opsi,dc=experimental42
Object exists: cn=udm_syntax,cn=univention,dc=ucs,dc=backup-opsi,dc=experimental42
Object created: cn=univention-portal,cn=ldapschema,cn=univention,dc=ucs,dc=backup-opsi,dc=experimental42

Object created: cn=62univention-portal,cn=ldapacl,cn=univention,dc=ucs,dc=backup-opsi,dc=experimental42

Object created: cn=univention-portal,cn=udm_syntax,cn=univention,dc=ucs,dc=backup-opsi,dc=experimental42

Waiting for activation of the extension object univention-portal: .........................................................ERROR: Primary Directory Node did not mark the extension object active within 180 seconds.
ERROR
ucs_registerLDAPExtension: registraton of /usr/lib/univention-portal/schema/univention-portal.schema failed.

EDIT:
I looked into /var/cache/univention-system-setup/profile: root_password was commented out. I uncommented it and replaced the value with the cleartext password. Then I ran /usr/lib/univention-system-setup/scripts/setup-join.sh, after which univention-check-join-status reported success, login via the web interface was possible and a backup could join.
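A quick check for this state of the profile could look like the sketch below. It assumes the profile consists of shell-style KEY=VALUE lines, which is an assumption about the file format; `profile_key_state` is a made-up helper name, and KEY must not contain regular expression metacharacters.

```shell
# profile_key_state FILE KEY
# Prints "set" if FILE has an active KEY= line, "commented" if KEY= only
# appears behind a "#", and "missing" otherwise.
profile_key_state() {
    if grep -Eq "^[[:space:]]*$2=" "$1"; then
        echo set
    elif grep -Eq "^[[:space:]]*#[[:space:]]*$2=" "$1"; then
        echo commented
    else
        echo missing
    fi
}

# Example use:
#   profile_key_state /var/cache/univention-system-setup/profile root_password
```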

Any suggestions on where to look?

Regards
Mathias