Problem: UCS 4.3-4 Samba 4.10 Update aborts during database re-indexing (No such object)


Situation:

You have updated from Samba 4.7 to Samba 4.10 and the automatic reindex has failed with the following message in the updater.log:

Reindexing: re_index successful on /var/lib/samba/private/sam.ldb.d/DC=DOMAIN,DC=NET.ldb, final index write-out will be in transaction commit
ltdb: tdb(/var/lib/samba/private/sam.ldb.d/DC=DOMAIN,DC=NET.ldb): tdb_recovery_allocate: overflow recovery area

ltdb: tdb(/var/lib/samba/private/sam.ldb.d/DC=DOMAIN,DC=NET.ldb): tdb_transaction_prepare_commit: failed to setup recovery data

Failure during prepare_write): Record does not exist -> No such object
ltdb: tdb(/var/lib/samba/private/sam.ldb.d/DC=DOMAIN,DC=NET.ldb): tdb_transaction_cancel: no transaction

re-indexed database : (32, 'prepare_commit error on DC=domain,DC=net: Failure during prepare_write): Record does not exist -> No such object')
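To confirm quickly that a server hit exactly this failure, you can search the log for the two characteristic messages quoted above. The following is a minimal sketch; the function name has_reindex_overflow is made up for illustration, and the default path /var/log/univention/updater.log is an assumption about where your updater.log lives.

```shell
# Hypothetical helper: succeeds if the given log contains both messages
# that characterise the failed re-index described in this article.
# The default log path is an assumption; pass another path as $1 if needed.
has_reindex_overflow() {
    log=${1:-/var/log/univention/updater.log}
    grep -q 'tdb_recovery_allocate: overflow recovery area' "$log" &&
        grep -q 'prepare_commit error' "$log"
}
```

Example: has_reindex_overflow && echo "This server is affected"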

Solution:

You need to have at least two Samba/AD DCs in your domain to follow the procedure described below. While there are several variations of how to proceed, the following procedure is the most generic way, irrespective of the particular dimensions of your domain.

As a first step, you should install a new UCS server with Samba/AD, e.g. a DC Backup. Please select a UCS 4.3-4 installation medium for that. It doesn’t matter if you choose to install a new DC Slave instead of a DC Backup; we’ll assume a DC Backup in the text below. Having a DC Backup is recommended anyway in case the DC Master fails.

During installation of the new system, it’s important to leave the default option that updates the system to the latest erratum activated. This ensures that the new DC already has the Samba 4.10 packages installed.

Before going through the details, here’s the big picture of this migration plan:

The Samba backend database needs to be converted after the update from Samba 4.7 to Samba 4.10. The reason is that Samba changed the primary key of the databases from DN (variable length) to objectGUID (fixed length). This in turn requires the database index (which maps attribute values back to objects) to be updated as well. You are consulting this article because the in-place migration failed due to the size restriction of the TDB database (which is the technological basis of the LDB database files).

First the good news: the database is not broken; it is simply still in the old database format, which needs conversion for Samba 4.10. So the plan is to first downgrade the Samba packages back to 4.7 on the affected systems. That allows Samba to start again normally. Once that’s done, you will use the new Samba 4.10 system to clone the Samba databases of the affected systems, one system at a time; while being cloned, they are converted into the new Samba 4.10 format. The cloned database is not active and can simply be copied to the original UCS server it came from.

Once that’s done, each UCS server can be updated one at a time by stopping the Samba processes, moving the new-format database files into place (replacing the old-format ones), and then running univention-upgrade to update the program packages accordingly. Sounds reasonable? Then let’s get to it.
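Since the failure is tied to the size of the TDB files, it can help to see how large the partition files on an affected server actually are. The sketch below simply prints the size of every *.ldb file in a directory; the function name list_ldb_sizes is made up for illustration. As far as we know, TDB files use 32-bit offsets and are therefore limited to roughly 4 GiB, so files approaching that size are the likely candidates.

```shell
# Hypothetical helper: print the size in MiB (rounded down) and the path
# of every *.ldb file in the directory given as $1.
list_ldb_sizes() {
    dir=$1
    for f in "$dir"/*.ldb; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        printf '%s MiB\t%s\n' "$(( $(wc -c < "$f") / 1048576 ))" "$f"
    done
}
```

Example: list_ldb_sizes /var/lib/samba/private/sam.ldb.d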

Note: In UCS@school domains this procedure should be followed only on the Samba/AD DCs in the central school department, but not on the UCS@school Slave PDCs. The steps required for the UCS@school Slave PDCs are described in a section below. The order of migration doesn’t matter, though. If you happen to have a UCS@school Slave PDC affected by the update issue, you may as well fix that one first. We recommend reading the full description first, though, to fully understand the procedure.

Step one: Get Samba 4.7 running again on the domain controller that hit this issue.

You should log on to each of the Samba/AD DCs that have already run into this issue and downgrade the Samba packages, so you can start Samba 4.7 again normally by running “/etc/init.d/samba start” (FYI: don’t use systemctl or “service” for Samba).

The following commands should get Samba 4.7 running again:

/etc/init.d/samba stop

OLD_PKG_VERSION_SAMBA='2:4.7.8-1A~4.3.0.201905081755'
OLD_PKG_VERSION_LDB='2:1.2.3-1A~4.3.0.201801031047'
OLD_PKG_VERSION_TDB='1.3.15-2A~4.3.0.201712121753'
OLD_PKG_VERSION_TALLOC='2.1.11-1A~4.3.0.201802090138'
OLD_PKG_VERSION_TEVENT='0.9.36-1A~4.3.0.201808081834'

univention-install \
        libsmbclient="$OLD_PKG_VERSION_SAMBA" \
        libwbclient0="$OLD_PKG_VERSION_SAMBA" \
        python-samba="$OLD_PKG_VERSION_SAMBA" \
        samba="$OLD_PKG_VERSION_SAMBA" \
        samba-common="$OLD_PKG_VERSION_SAMBA" \
        samba-common-bin="$OLD_PKG_VERSION_SAMBA" \
        samba-dsdb-modules="$OLD_PKG_VERSION_SAMBA" \
        samba-libs="$OLD_PKG_VERSION_SAMBA" \
        samba-vfs-modules="$OLD_PKG_VERSION_SAMBA" \
        smbclient="$OLD_PKG_VERSION_SAMBA" \
        winbind="$OLD_PKG_VERSION_SAMBA" \
        ldb-tools="$OLD_PKG_VERSION_LDB" \
        libldb1="$OLD_PKG_VERSION_LDB" \
        python-ldb="$OLD_PKG_VERSION_LDB" \
        libtdb1="$OLD_PKG_VERSION_TDB" \
        python-tdb="$OLD_PKG_VERSION_TDB" \
        tdb-tools="$OLD_PKG_VERSION_TDB" \
        libtalloc2="$OLD_PKG_VERSION_TALLOC" \
        python-talloc="$OLD_PKG_VERSION_TALLOC" \
        libtevent0="$OLD_PKG_VERSION_TEVENT"

Now Samba 4.7 should be running again and we can continue with the next steps.
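Before continuing, you may want to verify that the downgrade really took effect. The following sketch reduces a Debian package version string such as the ones used above to its Samba release series; the function name samba_series is made up for illustration.

```shell
# Hypothetical helper: reduce a Debian package version such as
# '2:4.7.8-1A~4.3.0.201905081755' to its release series ('4.7').
samba_series() {
    v=${1#*:}            # drop the epoch ("2:") if there is one
    major=${v%%.*}       # text before the first dot
    rest=${v#*.}         # text after the first dot
    minor=${rest%%.*}    # text up to the next dot
    printf '%s.%s\n' "$major" "$minor"
}
```

Example: samba_series "$(dpkg-query -W -f='${Version}' samba)" should print 4.7 on a successfully downgraded system.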

Step two: Install a new UCS 4.3-4 DC Backup with the latest Errata

You need to install and join the new UCS 4.3-4 Backup at this point. Please make sure to upgrade to the latest errata version (errata 542 is sufficient) during installation. It is important to have the system up to date before joining, because it needs to run Samba 4.10 during the join.

Step three: Stop the S4-Connector to avoid unwanted changes during migration.

To avoid external modifications of the Samba/AD data, first stop the central S4-Connector by logging on to that system and running /etc/init.d/univention-s4-connector stop.

You may run the following command to find out the hostname of the system running the S4-Connector:

. /usr/share/univention-samba4/lib/all.sh
get_available_s4connector_dc

Stop the connector on this server with

/etc/init.d/univention-s4-connector stop

Step four: Clone the Samba 4.7 database of the affected UCS DC into the Samba 4.10 index format via network.

Next, log in to the “good” new DC Backup or DC Slave, which is already running Samba 4.10, and run the following commands. We’ll use the hostnames “affecteddc” and “410dc” below.

## The following patch is reasonable for this migration to keep the password hashes as they are in the Samba database.
## If you don't do this, the password hashes will be stored on disk with an extra layer of encryption. That's totally
## transparent to the operation of Samba and a per-server choice. Starting with Samba 4.8 the extra layer of encryption is
## activated by default for new installations and joins. The patch just shows how to keep the former behavior. It should
## not matter unless you are hosting your Samba backend databases in a remote outsourced datacenter.

sed -i 's/    ctx.do_join()/    ctx.plaintext_secrets=True\n    ctx.do_join()/' /usr/lib/python2.7/dist-packages/samba/join.py 

samba-tool drs clone-dc-database "$(dnsdomainname)" --server=affecteddc -UAdministrator --targetdir /var/tmp/affecteddc --include-secrets

sed -i 's/    ctx.plaintext_secrets=True//' /usr/lib/python2.7/dist-packages/samba/join.py

## Then copy the cloned database to the affected UCS server
rsync -a /var/tmp/affecteddc/ affecteddc:/var/tmp/affecteddc/
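Before moving on, it is worth checking that the clone is complete and that the temporary patch to join.py has really been reverted. The following is a minimal sketch under the assumption that a complete clone contains private/sam.ldb and at least one partition file in private/sam.ldb.d/, which are the files copied in the next step; both function names are made up for illustration.

```shell
# Hypothetical helper: succeeds if the clone directory given as $1 contains
# the main sam.ldb and at least one partition file below sam.ldb.d/.
check_clone_dir() {
    dir=$1
    if [ ! -f "$dir/private/sam.ldb" ]; then
        echo "missing $dir/private/sam.ldb" >&2
        return 1
    fi
    # Reuse the function's positional parameters to hold the glob result.
    set -- "$dir"/private/sam.ldb.d/*.ldb
    if [ ! -f "$1" ]; then
        echo "no partition files in $dir/private/sam.ldb.d" >&2
        return 1
    fi
    echo "clone in $dir looks complete: $# partition file(s)"
}

# Hypothetical helper: succeeds if the temporary plaintext_secrets line
# inserted by the first sed call is still present in the file given as $1.
join_patch_active() {
    grep -qF 'ctx.plaintext_secrets=True' "$1"
}
```

Example: check_clone_dir /var/tmp/affecteddc, and join_patch_active /usr/lib/python2.7/dist-packages/samba/join.py && echo "patch is still applied"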

Step five: Log in to the affected UCS Samba 4.7 server and replace the old database with the freshly cloned data:

ucr set dns/backend='ldap'; /etc/init.d/bind9 restart
/etc/init.d/samba stop

ridsetldif=$(ldbsearch -H /var/lib/samba/private/sam.ldb -b  "CN=RID Set,CN=$(hostname),OU=Domain Controllers,$(ucr get connector/s4/ldap/base)")

nextrid=$(sed -n 's/^rIDNextRID: //p' <<<"$ridsetldif")
prevpool=$(sed -n 's/^rIDPreviousAllocationPool: //p' <<<"$ridsetldif")

## Important: Backup the Samba private data directory:
cp -a /var/lib/samba/private/ /var/lib/samba/private.backup

## Optional: compact the cloned ldb files (the original is only replaced if tdbbackup succeeded)
cd /var/tmp/affecteddc/private/sam.ldb.d/
for f in *.ldb; do tdbbackup "$f" && ls -l "$f"* && mv "$f.bak" "$f"; done

rsync -a /var/tmp/affecteddc/private/sam.ldb \
              /var/lib/samba/private/

rsync -a /var/tmp/affecteddc/private/sam.ldb.d/*.ldb \
              /var/lib/samba/private/sam.ldb.d/

if [ -f /var/tmp/affecteddc/private/encrypted_secrets.key ]; then
  rsync -a /var/tmp/affecteddc/private/encrypted_secrets.key /var/lib/samba/private/
fi

## Next, update Samba so that the ldb tools (ldbmodify below) work against the new database format:
univention-upgrade --updateto=4.3-4 --disable-app-updates
/etc/init.d/samba stop

## Finally, restore the local RID Set state:
echo -e "dn: CN=RID Set,CN=$(hostname),OU=Domain Controllers,$(ucr get connector/s4/ldap/base)\nchangetype: modify\nreplace: rIDNextRID\nrIDNextRID: $nextrid\n-\nreplace: rIDPreviousAllocationPool\nrIDPreviousAllocationPool: $prevpool" | ldbmodify -H /var/lib/samba/private/sam.ldb

/etc/init.d/samba start
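The RID restore depends on $nextrid and $prevpool having been read successfully before the database was replaced. If the ldbsearch at the beginning of this step returned nothing, the echo -e line above would write empty values. The following sketch shows one way to guard against that: it extracts an attribute from LDIF text and refuses to build the modify LDIF when a value is missing. The function names rid_attr and build_rid_ldif are made up for illustration; the generated LDIF is the same as the one piped into ldbmodify above.

```shell
# Hypothetical helper: print the value of attribute $1 from the LDIF text $2.
rid_attr() {
    printf '%s\n' "$2" | sed -n "s/^$1: //p"
}

# Hypothetical helper: print the modify LDIF for the RID Set object.
# $1 = DN, $2 = rIDNextRID, $3 = rIDPreviousAllocationPool
# Fails without printing any LDIF if one of the two values is empty.
build_rid_ldif() {
    if [ -z "$2" ] || [ -z "$3" ]; then
        echo "refusing to build LDIF: empty RID value" >&2
        return 1
    fi
    printf 'dn: %s\nchangetype: modify\nreplace: rIDNextRID\nrIDNextRID: %s\n-\nreplace: rIDPreviousAllocationPool\nrIDPreviousAllocationPool: %s\n' \
        "$1" "$2" "$3"
}
```

Example: build_rid_ldif "CN=RID Set,CN=$(hostname),OU=Domain Controllers,$(ucr get connector/s4/ldap/base)" "$nextrid" "$prevpool" | ldbmodify -H /var/lib/samba/private/sam.ldb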

Step six: Start the S4-Connector and the DNS-Server again:

/etc/init.d/univention-s4-connector start
ucr set dns/backend='samba4'; /etc/init.d/bind9 restart

Now repeat steps one to six for all other UCS Samba/AD DC servers in your UCS domain (step two, installing the new Samba 4.10 DC, only needs to be done once).

After that, all the UCS Samba/AD servers should be running Samba 4.10 with the reindexed database format.

Extra section for UCS@school Slave PDCs:

For UCS@school Slave PDCs you may either follow the same steps as for the servers above, or you can re-join the server.