Problem: MySQL Server Does not Start any more

Problem

After an update the database server does not start any more thus leading to Kopano services are all dead.

Environment

With journalct -xe you will see following errors:


systemctl status mariadb.service
● mariadb.service - MariaDB database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: signal) since Thu 2018-10-11 07:44:04 CEST; 2s ago
  Process: 8550 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 8548 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
  Process: 8519 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION (code=killed, signal=ABRT
  Process: 8402 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $?
  Process: 8397 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 8395 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
Main PID: 8519 (code=killed, signal=ABRT)
   Status: "Taking your SQL requests now..."
    Tasks: 0 (limit: 4915)
   Memory: 780.0K
      CPU: 1.212s
   CGroup: /system.slice/mariadb.service
 
Okt 11 07:44:04 kopano systemd[1]: mariadb.service: Unit entered failed state.
Okt 11 07:44:04 kopano systemd[1]: mariadb.service: Failed with result 'signal'.
Okt 11 07:44:10 kopano systemd[1]: mariadb.service: Service hold-off time over, scheduling restart.
Okt 11 07:44:10 kopano systemd[1]: Stopped MariaDB database server.

Additionally you might see the following in /var/log/mysql/error.log

Server version: 10.1.26-MariaDB-0+deb9u1
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=0
max_threads=153
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 352445 K  bytes of memory
Hope thats ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x5624287f836e]
/usr/sbin/mysqld(handle_fatal_signal+0x3bd)[0x56242833da4d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x110c0)[0x7f77d33230c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcf)[0x7f77d1c1dfff]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7f77d1c1f42a]
/usr/sbin/mysqld(+0x94ec5b)[0x5624286e9c5b]
/usr/sbin/mysqld(+0x94fdda)[0x5624286eadda]
/usr/sbin/mysqld(+0x95c043)[0x5624286f7043]
/usr/sbin/mysqld(+0x8e6df4)[0x562428681df4]
/usr/sbin/mysqld(+0x8e8d3d)[0x562428683d3d]
/usr/sbin/mysqld(+0x8afb93)[0x56242864ab93]
/usr/sbin/mysqld(+0x915ea8)[0x5624286b0ea8]
/usr/sbin/mysqld(+0x9057b4)[0x5624286a07b4]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7494)[0x7f77d3319494]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f77d1cd3acf]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: innodb_empty_free_list_algorithm has been changed to legacy because of small buffer pool size. In order to use backoff, increase buffer pool at least up to 20MB.
 
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: The InnoDB memory heap is disabled
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: Compressed tables use zlib 1.2.8
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: Using Linux native AIO
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: Using SSE crc32 instructions
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: Initializing buffer pool, size = 150.0M
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: Completed initialization of buffer pool
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: Highest supported file format is Barracuda.
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: The log sequence number 88453027235 in ibdata file do not match the log sequence number 88455735105 in the ib_logfiles!
2018-10-11  7:30:52 140064621376064 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
2018-10-11  7:30:53 140064621376064 [Note] InnoDB: 128 rollback segment(s) are active.
2018-10-11  7:30:53 140064621376064 [Note] InnoDB: Waiting for purge to start
2018-10-11  7:30:53 140064621376064 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.36-82.1 started; log sequence number 88455735105
2018-10-11  7:30:53 140064621376064 [Note] Plugin 'FEEDBACK' is disabled.
2018-10-11  7:30:53 140064621376064 [Note] Recovering after a crash using tc.log
2018-10-11  7:30:53 140064621376064 [Note] Starting crash recovery...
2018-10-11  7:30:53 140064621376064 [Note] Crash recovery finished.
2018-10-11  7:30:53 140063912490752 [Note] InnoDB: Dumping buffer pool(s) not yet started
2018-10-11  7:30:53 140064621376064 [Note] Server socket created on IP: '0.0.0.0'.
2018-10-11  7:30:53 140064621376064 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.1.26-MariaDB-0+deb9u1'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  Debian 9.1
2018-10-11 07:30:53 7f632f7fb700  InnoDB: Assertion failure in thread 140063975388928 in file btr0cur.cc line 294
InnoDB: Failing assertion: sibling_mode == RW_NO_LATCH || btr_page_get_next(get_block->frame, mtr) == page_get_page_no(page)
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
181011  7:30:53 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Solution

The database is broken for some issue. You will need to restore your backup or try to recover the error.

Step 1

Get the root password for your local database by typing:
cat /etc/mysql.secret
Use this password for the later commands (best by copy&paste).

Step 2

To recover, start the mysql database server in recovery mode by setting:
ucr set mysql/config/mysqld/innodb_force_recovery=1

Start server by
systemctl start mariadb

If the server now starts up and stays up you have to dump your databases in step 3. If it does not start up you have to increase the above value of 1 stepwise. Next, use 2, if it still does not start use 3 and so on. Maximum value supported is 6.
Note: Values of 4 or greater can permanently corrupt data files! Use at your own risk and only if you do not have a backup.

Step 3

As your server is now up and running in recovery mode dump your database:

mysqldump -h localhost --opt --lock-all-tables --triggers --routines --flush-logs -u root -p --all-databases --events > mysql.dump

And restore it immediately:

mysql -u root -p  < mysql.dump

This will fix possible index errors in the database and re-create all indexes.
In case you get an error as follows:
ERROR 1036 (HY000) at line 89: Table ‘entries’ is read only
with errors in the log files similar to these:
[ERROR] Can’t init tc log
do the following:

systemctl stop mariadb
mv /var/lib/mysql/tc.log /root/tc.log.bak

And then restart with step 2.
Note: The tc.log is needed for distributed transactions. Usually they are not used by any UCS/ Kopano/ Owncloud apps. You might want to verify this before removing the file.

Step 4

Restart Database in normal operation mode

Reset the recovery parameter by and restart database server:

ucr unset mysql/config/mysqld/innodb_force_recovery
systemctl restart mariadb

Step 5

If you did not succeed with the above steps you might have an error with InnoDB tablespaces. Check your /var/log/mysql/error.log for following lines:

2018-10-11 12:04:01 140018301646592 [ERROR] InnoDB: Failed to find tablespace for table '"mysql"."gtid_slave_pos"' in the cache. Attempting to load the tablespace with space id 3.

In this case you can try to recover the corrupt tablespaces without risk of data loss.

Step 5.1

Backup existing files in case something goes terribly wrong.

systemctl mariadb stop
cd /var/lib
tar czpvf ~/mysql_files.tar.gz mysql/

Step 5.2

Start database with the lowest recovery value possible (raise the level if needed step-by-step) to have your mysql server staying up stable.

ucr set mysql/config/mysqld/innodb_force_recovery=1  # try 2, 3, 4 ,5 ,6 in that order
systemctl start mariadb

Step 5.3

Enable files_per_table (by default on):

mysql -h localhost -u root -p -e "SET GLOBAL innodb_file_per_table=1"

Step 5.4

Drop the tablespace from the above referenced (Step 5) tables and re-import them.

mysql -h localhost -u root -p  -e "alter table mysql.gtid_slave_pos discard tablespace; show warnings;"
mysql -h localhost -u root -p  -e "alter table mysql.gtid_slave_pos import tablespace; show warnings"

Step 5.5

Try mysqldump if it succeeds.

ucr unset mysql/config/mysqld/innodb_force_recovery
systemctl restart mariadb
mysqldump -h localhost --opt --lock-all-tables --triggers --routines --flush-logs -u root - --all-databases --events > mysql.dump

Related Links:
MySQL documentation
Recover From Broken Tablespaces

Mastodon