Boot Problems with RAID (mdadm): Boots into Emergency Mode


#1

Hi,

I have a Univention slave system with version 4.3-1 errata145.
I added a box with 4 HDDs via USB 3.0 and created a RAID with mdadm.
So far everything was running fine.
Then I rebooted the server and it didn’t come up. So I checked the output on the display, and it showed that the system had booted into emergency mode. It told me to check the journal:

(screenshot: emergency-mode console output)

It complains about the RAID.
When I check cat /proc/mdstat, the RAID looks fine.
I removed the RAID and the mdadm.conf and ran update-initramfs -u -k all again. Now the system boots again, but I don’t have my RAID anymore.
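The recovery steps described above, as a rough sketch (the array name /dev/md127 is taken from later posts in this thread; the mdadm.conf path is the Debian/UCS default):

```shell
# Stop the array and remove its definition so the initramfs no longer
# tries to assemble it at boot, then rebuild the initramfs:
mdadm --stop /dev/md127                          # stop the assembled array
mv /etc/mdadm/mdadm.conf /root/mdadm.conf.bak    # remove (back up) the config
update-initramfs -u -k all                       # regenerate initramfs for all kernels
```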
The RAID is not critical for booting, so what do I need to do so that the system still boots even if there is a problem with the RAID?
Thank you very much for your help!
Kind Regards,

Tobias Lorentz


#2

It shows an I/O error on the RAID. Are you sure it is configured properly?

Have you already included it in your /etc/fstab? If yes, set it up again without adding it to /etc/fstab.

And verify the boot order in BIOS.

Am I right that you have locally attached HDDs that UCS is running and booting from, and that you just added the additional USB box to the computer? How does it go when you attach the USB box without creating a RAID? Can you access all five (?) disks?
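To check the array and the individual disks, a few read-only diagnostic commands can help (the array and device names here are assumptions):

```shell
cat /proc/mdstat                        # overview of all md arrays and their sync state
mdadm --detail /dev/md127               # array state, plus failed/removed member disks
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT      # all block devices, incl. the USB disks
dmesg | grep -iE 'usb|md127|i/o error'  # kernel messages about USB resets / I/O errors
```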


#3

Hi,

It had been running properly, but before I rebooted it had lost its drives and I was not able to reattach them. Therefore I tried a reboot, because when I tried to re-add the drives to /dev/md127, it told me that they were already in use in /dev/md127…
I had already included /dev/md127 in /etc/fstab.
Yes, my UCS has a locally attached SSD that everything else runs from. In addition I want to use the USB box for shares.
When I do an lsblk I see all 4 drives of my RAID5 (sdb, sdc, sdd, sde).

The RAID was running for 3 days without problems, but then it lost its drives again. I don’t know why this happens…

Kind Regards,

Tobi


#4

Well, your RAID issues are not really related to UCS. It’s a generic Linux issue.

But to prevent UCS from failing during boot when the RAID is not available, you should add an option to your /etc/fstab.

Before (example):
/dev/md127 /mnt ext4 defaults 0 0
use
/dev/md127 /mnt ext4 defaults,errors=continue 0 0

In this case UCS will start fine even when your RAID is not available. Then you can log in as usual and troubleshoot why the RAID failed.
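One addition that may be worth trying: on systemd-based systems, a device listed in /etc/fstab is treated as a hard boot dependency unless the entry also carries the nofail option. A sketch of a fully boot-tolerant line (using the /mnt mount point from the example above) might be:

```text
# Sketch: mount the RAID if present, but never block booting on a missing device.
# nofail                        – systemd does not enter emergency mode if the device is absent
# x-systemd.device-timeout=10s  – stop waiting for the device after 10 seconds
/dev/md127  /mnt  ext4  defaults,errors=continue,nofail,x-systemd.device-timeout=10s  0  0
```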

/KNEBB


#5

Thank you very much!


#6

Hi,

thank you very much for your hint.
I have done this, but the system still boots into emergency mode when the RAID is not present…
What else could I change to avoid this? The system should always boot, even when the RAID isn’t there…

Kind Regards,

Tobias Lorentz


#7

To verify it is related to /etc/fstab, comment out the line by adding a “#” in front of it. That way your RAID will never get mounted automatically, and we can see whether the problem is related to the fstab entry or the emergency mode starts due to a different issue.
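For example, with a hypothetical entry (device and mount point are placeholders):

```text
# Commented out for testing – this RAID entry is ignored at boot:
#/dev/md127  /mnt/raid  ext4  defaults,errors=continue  0  0
```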

And you could post your /etc/fstab here so we can re-check it (and please, copy & paste, not a screenshot!).

/KNEBB


#8

Hi,

when I comment out the line in fstab, it boots up.
Here is the content of my /etc/fstab:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
/dev/mapper/vg_ucs-root	/	ext4	errors=remount-ro,user_xattr	0	1	
# /boot was on /dev/sda1 during installation
UUID=82036816-54ad-439c-8c0c-a8966c971593	/boot	ext2	defaults,user_xattr	0	2	
/dev/mapper/vg_ucs-swap_1	none	swap	sw	0	0	
/dev/sdb1	/media/usb0	auto	rw,user,noauto	0	0
#/dev/md127	/media/MD127_DATA	ext4	defaults,errors=continue	0	0

Kind Regards,

Tobias Lorentz


#9

Looking good.

You have to try. Maybe the “defaults” statement is wrong, so remove it and leave just the errors=… option in there.

See if it works.
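Using the entry from post #8, the line would then become (sketch):

```text
/dev/md127	/media/MD127_DATA	ext4	errors=continue	0	0
```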