Win2016 member server can't access UCS DC until samba restart

Dear Community, dear Univention Team!

We have been facing a strange problem, presumably since the upgrade to UCS 4.4.

We have a Windows 2016 server which is joined as a member server to a UCS Samba 4 domain.

Several times a day, the winserver seems to lose its connection to the DC master. While the connection is lost, the winserver can’t access the AD, although the UCS is still reachable from the winserver via TCP and DNS is still working.

While the winserver is in that error state, the shares on the winserver can’t be accessed (because domain users can’t be verified).

“nltest /SC_VERIFY:” on the winserver gives Error 1311 (NO_LOGON_SERVERS).

The error temporarily goes away if EITHER the Samba service on the UCS is restarted OR the winserver is rebooted.

Rejoining the winserver to the domain does not help.

Unfortunately, I am not very experienced with Windows servers and I do not really know how to debug this.

Both servers are KVM virtual machines on the same physical server.

Does anyone have some advice for me?
Greetings from Gießen,
Gerd

Hi,

indeed a strange issue. I do not know if I can help but I can try.

“NO_LOGON_SERVERS” means what it says. The winserver cannot connect to a logon server.
There are several topics which might be related:

  • DNS
    Make sure the winserver can reach the UCS by name. Run ipconfig /all and post its output here. It should point to the UCS server as its DNS server, and to no other!
    Once that is done (post the output!), check the logs on the UCS to see whether bind9 reports any errors!
    On the UCS run /usr/share/univention-samba4/scripts/check_essential_samba4_dns_records.sh
  • Firewall
    Is there a router and/or firewall in between the UCS and the winserver? Have you tweaked any firewall rules (either on UCS or winserver)?
  • DHCP
    Is one of these two servers configured to use DHCP as client?
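The DNS check above can be partly automated. The following is only a minimal sketch: the helper name check_srv_targets is hypothetical, and it assumes the input consists of lines in the "... has SRV record <prio> <weight> <port> <target>." form. It reads such lines on stdin and reports any SRV record whose target is not the expected DC.

```shell
#!/bin/sh
# Hypothetical helper (not part of UCS): verify that every SRV record
# read from stdin points at the expected domain controller.
# Exit status: 0 = all targets match, 1 = at least one mismatch,
#              2 = no SRV record lines were seen at all.
check_srv_targets() {
    # Accept the FQDN with or without a trailing dot.
    want="${1%.}."
    awk -v want="$want" '
        / has SRV record / {
            n++
            # The last field of the line is the SRV target.
            if ($NF != want) {
                print "unexpected target: " $0
                bad = 1
            }
        }
        END {
            if (n == 0) {
                print "no SRV records seen"
                exit 2
            }
            exit (bad ? 1 : 0)
        }
    '
}
```

On the UCS it could be fed from the script mentioned above, for example: /usr/share/univention-samba4/scripts/check_essential_samba4_dns_records.sh | check_srv_targets server.example.org (the host name here is a placeholder for the DC's own FQDN).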

That should do it for the first troubleshooting steps. Let me know your findings.

/CV

Hi Christian!

Thanks for your help! I have checked the points you mentioned, but found no errors:
DNS:
The UCS can be reached by name. Even in the error state.

Firewall:
There is no router between them. I disabled UCS Firewall with no effect. The Windows firewall is in default state. Both Servers are in the same subnet.

DHCP
The Windows server is in fact configured as DHCP Client, but the UCS serves a fixed IP to the windows server. The Windows Server uses this IP all right, even in error state.

Gerd

Please re-read my comment and follow it in a detailed way!
Additionally, post the output of
univention-ldapsearch "cn=SERVERNAME"

/CV

Hi Christian,

Thanks for your help, and sorry for not reading carefully :wink:
Here is the output of the commands. Yours, Gerd

Here is the output of ipconfig /all:
– snip –

Windows-IP-Konfiguration

   Hostname  . . . . . . . . . . . . : winserver2016
   Primäres DNS-Suffix . . . . . . . : juwe.local
   Knotentyp . . . . . . . . . . . . : Hybrid
   IP-Routing aktiviert  . . . . . . : Nein
   WINS-Proxy aktiviert  . . . . . . : Nein
   DNS-Suffixsuchliste . . . . . . . : juwe.local

Ethernet-Adapter Ethernet 2:

   Verbindungsspezifisches DNS-Suffix: juwe.local
   Beschreibung. . . . . . . . . . . : Intel(R) PRO/1000 MT Network Connection
   Physische Adresse . . . . . . . . : 7A-F4-90-15-9F-B5
   DHCP aktiviert. . . . . . . . . . : Ja
   Autokonfiguration aktiviert . . . : Ja
   Verbindungslokale IPv6-Adresse  . : fe80::7160:338:7993:787a%5(Bevorzugt) 
   IPv4-Adresse  . . . . . . . . . . : 192.168.0.111(Bevorzugt) 
   Subnetzmaske  . . . . . . . . . . : 255.255.255.0
   Lease erhalten. . . . . . . . . . : Dienstag, 30. Juni 2020 08:57:56
   Lease läuft ab. . . . . . . . . . : Mittwoch, 1. Juli 2020 20:57:57
   Standardgateway . . . . . . . . . : 192.168.0.254
   DHCP-Server . . . . . . . . . . . : 192.168.0.201
   DHCPv6-IAID . . . . . . . . . . . : 91944080
   DHCPv6-Client-DUID. . . . . . . . : 00-01-00-01-26-82-3D-A2-7A-F4-90-15-9F-B5
   DNS-Server  . . . . . . . . . . . : 192.168.0.201
   NetBIOS über TCP/IP . . . . . . . : Aktiviert

Tunneladapter isatap.juwe.local:

   Medienstatus. . . . . . . . . . . : Medium getrennt
   Verbindungsspezifisches DNS-Suffix: juwe.local
   Beschreibung. . . . . . . . . . . : Microsoft ISATAP Adapter
   Physische Adresse . . . . . . . . : 00-00-00-00-00-00-00-E0
   DHCP aktiviert. . . . . . . . . . : Nein
   Autokonfiguration aktiviert . . . : Ja

– snap –

Here is the output of
/usr/share/univention-samba4/scripts/check_essential_samba4_dns_records.sh
– snip –

gc._msdcs.juwe.local has address 192.168.0.201
_gc._tcp.juwe.local has SRV record 0 100 3268 server.juwe.local.
_ldap._tcp.gc._msdcs.juwe.local has SRV record 0 100 3268 server.juwe.local.
_ldap._tcp.juwe.local has SRV record 0 100 389 server.juwe.local.
_ldap._tcp.dc._msdcs.juwe.local has SRV record 0 100 389 server.juwe.local.
_ldap._tcp.pdc._msdcs.juwe.local has SRV record 0 100 389 server.juwe.local.
_ldap._tcp.a51dd6b1-c754-4889-8736-7d730c635cef.domains._msdcs.juwe.local has SRV record 0 100 389 server.juwe.local.
_kerberos._tcp.dc._msdcs.juwe.local has SRV record 0 100 88 server.juwe.local.
_kerberos._tcp.juwe.local has SRV record 0 100 88 server.juwe.local.
_kerberos._udp.juwe.local has SRV record 0 100 88 server.juwe.local.
_kpasswd._tcp.juwe.local has SRV record 0 100 464 server.juwe.local.
_kpasswd._udp.juwe.local has SRV record 0 100 464 server.juwe.local.
Located DC 'server' in site 'Default-First-Site-Name'
aa878a8f-2f1e-429b-a7a0-4cd6e5e383a7._msdcs.juwe.local is an alias for server.juwe.local.
## Records for site Default-First-Site-Name:
_ldap._tcp.Default-First-Site-Name._sites.juwe.local has SRV record 0 100 389 server.juwe.local.
_ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.juwe.local has SRV record 0 100 389 server.juwe.local.
_kerberos._tcp.Default-First-Site-Name._sites.juwe.local has SRV record 0 100 88 server.juwe.local.
_kerberos._tcp.Default-First-Site-Name._sites.dc._msdcs.juwe.local has SRV record 0 100 88 server.juwe.local.
## Optional GC Records for site Default-First-Site-Name:
_gc._tcp.Default-First-Site-Name._sites.juwe.local has SRV record 0 100 3268 server.juwe.local.
_ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.juwe.local has SRV record 0 100 3268 server.juwe.local.
_kerberos.juwe.local descriptive text "JUWE.LOCAL

– snap –

And here is the output of univention-ldapsearch “cn=winserver2016”:

– snip –

# extended LDIF
#
# LDAPv3
# base <dc=juwe,dc=local> (default) with scope subtree
# filter: cn=winserver2016
# requesting: ALL
#

# WINSERVER2016, computers, juwe.local
dn: cn=WINSERVER2016,cn=computers,dc=juwe,dc=local
univentionServerRole: windows_client
displayName: WINSERVER2016
cn: WINSERVER2016
krb5PrincipalName: host/WINSERVER2016.juwe.local@JUWE.LOCAL
objectClass: krb5KDCEntry
objectClass: top
objectClass: univentionHost
objectClass: univentionObject
objectClass: sambaSamAccount
objectClass: person
objectClass: shadowAccount
objectClass: univentionWindows
objectClass: krb5Principal
objectClass: posixAccount
loginShell: /bin/false
univentionObjectType: computers/windows
uidNumber: 1839
krb5KDCFlags: 126
sambaAcctFlags: [W          ]
krb5MaxRenew: 604800
sn: WINSERVER2016
homeDirectory: /dev/null
sambaSID: S-1-5-21-1734167691-2428383143-702220222-2941
krb5MaxLife: 86400
uid: WINSERVER2016$
gidNumber: 1005
sambaPrimaryGroupSID: S-1-5-21-1734167691-2428383143-702220222-11011
univentionOperatingSystemVersion: 10.0 (14393)
univentionOperatingSystem: Windows Server 2016 Datacenter
aRecord: 192.168.0.111
univentionNetworkLink: cn=default,cn=networks,dc=juwe,dc=local
macAddress: 7a:f4:90:15:9f:b5
sambaNTPassword: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
krb5Key:: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX==
krb5Key:: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
krb5Key:: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX==
krb5Key:: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
krb5Key:: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
krb5KeyVersionNumber: 2
shadowLastChange: 18439
sambaPwdLastSet: 1593182690

# WINSERVER2016, juwe.local, dhcp, juwe.local
dn: cn=WINSERVER2016,cn=juwe.local,cn=dhcp,dc=juwe,dc=local
objectClass: top
objectClass: univentionObject
objectClass: univentionDhcpHost
univentionObjectType: dhcp/host
cn: WINSERVER2016
univentionDhcpFixedAddress: 192.168.0.111
dhcpHWAddress: ethernet 7a:f4:90:15:9f:b5

# search result
search: 3
result: 0 Success

# numResponses: 3
# numEntries: 2

– snap –
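Since the Windows server gets its address via DHCP, one thing that can be verified from the LDIF output is that the fixed DHCP address and the DNS A record agree. This is only a sketch: the function name check_fixed_address is hypothetical, and it assumes one aRecord and one univentionDhcpFixedAddress attribute in the input, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read univention-ldapsearch LDIF on stdin and compare
# the host's aRecord with its univentionDhcpFixedAddress.
# Exit status: 0 = both present and equal, 1 = mismatch, 2 = one is missing.
check_fixed_address() {
    awk '
        $1 == "aRecord:"                    { a = $2 }
        $1 == "univentionDhcpFixedAddress:" { d = $2 }
        END {
            if (a == "" || d == "") {
                print "missing aRecord or univentionDhcpFixedAddress"
                exit 2
            }
            if (a != d) {
                print "mismatch: aRecord=" a " dhcp=" d
                exit 1
            }
            print "ok: " a
        }
    '
}
```

Usage on the UCS would be along the lines of: univention-ldapsearch "cn=winserver2016" | check_fixed_address. For the LDIF posted here, both attributes are 192.168.0.111, so the check prints "ok: 192.168.0.111".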

Hi Community, hi Christian.
Do you see anything unusual? (Apart from the fact that the DNS name .local is not recommended; this is historical and has not caused any trouble in the past.)

I found out the following:

  1. While the winserver is in the error state, the netlogon service on the winserver cannot be stopped (and therefore cannot be restarted).
  2. If I restart the samba-ad-dc service on the UCS with “service samba-ad-dc restart”, the netlogon service on the winserver immediately resumes normal operation, without the need to restart the winserver or even the netlogon service.

Does anyone have further ideas for addressing the problem?

Thanks in advance, Gerd

Hi, after my holiday, I am still looking for a solution.

Any suggestions on how I can proceed to find the error?

Can I put “service samba-ad-dc restart” in the crontab and run it, let’s say, once an hour to ease the pain, or will that have other (bad) consequences?
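For illustration, such a stopgap could be wrapped in a small script that logs every restart, so that the restart times can later be compared with the times the winserver fails. This is only a sketch under assumptions: the log path, the script path in the crontab comment and the function name restart_and_log are hypothetical, and it does not answer the question about side effects. Each restart does make the services provided by Samba on the DC unavailable for a moment, so it treats the symptom, not the cause.

```shell
#!/bin/sh
# Hypothetical wrapper around the restart command from this thread.
# LOGFILE and RESTART_CMD can be overridden from the environment.
LOGFILE="${LOGFILE:-/var/log/samba-ad-dc-restart.log}"
RESTART_CMD="${RESTART_CMD:-service samba-ad-dc restart}"

# Run the restart command and append a timestamped result line to LOGFILE.
restart_and_log() {
    ts=$(date '+%Y-%m-%d %H:%M:%S')
    if $RESTART_CMD >/dev/null 2>&1; then
        echo "$ts restart ok" >> "$LOGFILE"
    else
        rc=$?
        echo "$ts restart FAILED (exit $rc)" >> "$LOGFILE"
        return 1
    fi
}

# Only act when called with --run, e.g. from a crontab entry such as
# (assumed path):  0 * * * * root /usr/local/sbin/samba-restart-logged.sh --run
if [ "${1:-}" = "--run" ]; then
    restart_and_log
fi
```

Keeping the log makes it possible to see whether the winserver still drops into the error state between two restarts, which would show that an hourly interval is not sufficient.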

Yours, Gerd