Loss of connectivity to UCS after adding a secondary Ethernet interface

System:

  • UCS 4.4-6
  • ESXi / vCenter 6.7

Problem:

  • After adding a secondary network interface to UCS, and performing a reboot, UCS is unable to communicate with any networks, and is unreachable.

Process:

  • Added a secondary interface via vCenter.
  • Went to UCS web interface > Network settings, added eth1, configured an IP, and applied. This works, and I am able to ping the system on the new interface.
  • Rebooted the server.
  • The server is unpingable. When logging in via the VMware console, I am unable to ping the default gateway of the original interface or any IP addresses on the new interface / subnet. I can only ping localhost and the local interfaces.

Troubleshooting performed:

  • Tried the process on both a main server and a backup server.
  • Tried to perform a traceroute from the VMware console after the reboot. The only hop that appears is the UCS itself, then it stops.
  • Checked ip route; it looks right. I see the default route going through the original interface.
  • Checked /etc/network/interfaces. I see the correct entries, and networkctl shows the same.
  • Tried disabling the firewall; this did nothing.
  • Tried multiple reboots
  • Tried reviewing documentation, but there is not much. Other posts on this forum state you can just add the interface and assign an IP.
  • Tried adding a secondary interface with no IP at all, without configuring the interface in UCS. After a reboot, the same thing happens.
  • Tried upgrading to the latest 4.4-7 and all packages.
    – Nothing works; the same problem remains after all of this troubleshooting.
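For anyone repeating the routing check from the VMware console, it can be scripted. This is only a sketch: the function name `default_route_dev` is made up for illustration, and it assumes `ip route` prints the default route in its usual `default via <gateway> dev <interface>` form.

```shell
#!/bin/sh
# Hypothetical helper: print the interface that carries the default route.
# Reads `ip route` style output on stdin, so it works on live or saved output.
default_route_dev() {
    awk '$1 == "default" { for (i = 2; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage on the affected system (not run here):
#   ip route | default_route_dev
```

In the situation described above this would print eth0, which matches what I saw: the routing table looked correct even while the server was unreachable.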

Workaround

  • After a reboot, connectivity to the server is lost. While the server is powered on in this unreachable state, if I disable the newly added interface in VMware, the server becomes reachable again on the original interface. If I then re-enable the secondary interface in VMware, the server is pingable on both the original and the secondary interface. However, after another reboot, connectivity is lost again.

Actual Goal:

  • We need to set up out of band management of UCS’s SSH and Apache2 web service. These services must be on a different interface, while all other UCS services can remain on eth0.
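Once both interfaces work, one way to see which addresses SSH and Apache actually listen on is to filter the output of `ss -tln`. The sketch below is an illustration only: `listen_addrs` is a made-up helper name, and it assumes the usual `ss` column layout where the fourth column is the local address and port. It only reports the current state and changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: print the local addresses listening on a given TCP port.
# Reads `ss -tln` style output on stdin; the header line is skipped because
# its first column is not "LISTEN".
listen_addrs() {
    port="$1"
    awk -v p="$port" '$1 == "LISTEN" {
        n = split($4, a, ":")          # the port is the last ":"-separated piece
        if (a[n] == p) { sub(":" p "$", "", $4); print $4 }
    }'
}

# Usage on the UCS system (not run here):
#   ss -tln | listen_addrs 22     # SSH
#   ss -tln | listen_addrs 443    # Apache2 (HTTPS)
```

An address of 0.0.0.0 or [::] means the service listens on every interface, so it would still need to be restricted to the out-of-band address.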

What is going on here? Am I missing a step?

After hours of troubleshooting this, I figured out an ACTUAL solution / possible reason for this problem.

What is causing this problem?
All signs pointed to a network adapter driver issue. When I checked the UCS VM’s network adapter settings in vCenter, I found that the UCS OVA deploys the adapter as type “Flexible”. This adapter type was causing the issue the whole time.
It seems this type is an older driver used in older versions of VMware. More info here (https://kb.vmware.com/s/article/1001805). This type is not available as a choice in vCenter 6.7. I assume that it worked fine by itself but did not play nicely with other adapters being added to the VM. The types available in 6.7 are E1000, VMXNET2, and VMXNET3. Even when the secondary adapter used one of the types I just listed, the failure still happened after a reboot as long as the original adapter was “Flexible”.
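The driver bound to each interface can also be checked from inside the guest, which is a quick way to confirm what the VM is really using. A minimal sketch, assuming a Linux guest that exposes the driver link under /sys/class/net (the helper name `nic_driver` is made up):

```shell
#!/bin/sh
# Hypothetical helper: print the kernel driver bound to a network interface.
# Prints "unknown" if the interface does not exist or has no driver link.
nic_driver() {
    link="/sys/class/net/$1/device/driver"
    if [ -e "$link" ]; then
        basename "$(readlink -f "$link")"
    else
        echo "unknown"
    fi
}

# Usage inside the UCS guest (not run here):
#   nic_driver eth0
#   nic_driver eth1
```

A VMXNET 3 adapter should report vmxnet3. What a “Flexible” adapter reports depends on whether the VMware Tools driver is loaded, so treat any other name as a hint to look at the adapter type in vCenter.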

The solution

  1. Before even adding a new interface, shut down UCS completely.
  2. Edit the settings for the UCS guest in vCenter / ESXi.
  3. Remove the current network adapter that is “Adapter type: Flexible”. (Make a note of the network / port group it is using.)
  4. Add a new network adapter. (This will be your new Network adapter 1 interface.) This time expand “Network adapter” and set “Adapter type” to “VMXNET 3”. Set it to the same network / port group your original adapter was using.
  5. Add a second network adapter. (This will be your Network adapter 2 interface.) Again expand “Network adapter” and set “Adapter type” to “VMXNET 3”. Set its network / port group to the one for out-of-band management (OOBM).

Now you have two adapters of type VMXNET3. Note: by default, “DirectPath I/O” is enabled on a network adapter when it is added as VMXNET3. I disabled this on both.

  6. Save the settings and power on the UCS guest. It should come back up with the same IP with no problems.
  7. Log in to UCS as an admin, go to Settings -> Network settings, and add the new eth1 adapter, type Ethernet. Set the IP for your OOBM interface accordingly. Select Finish.
  8. Back on the Network settings page, select Apply changes. Once it is complete, select “OK” on the red warning that appears at the bottom of the web interface.
  9. At this point you should be able to ping both interfaces. Test to make sure.
  10. If both are pingable, reboot the server.

Now the VM will power up fine after a reboot, and both interfaces will work properly.
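To double-check the result after the reboot, the two checks below can be run as a rough sketch. Both function names are made up for illustration. `check_reachable` assumes a Linux `ping` that accepts `-c` and `-W`. `vmx_adapter_types` assumes you have access to the VM's .vmx file and that the adapter type is stored under the `ethernetN.virtualDev` key. The addresses and path in the usage lines are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print "<address> reachable" or "<address> UNREACHABLE"
# for each argument, using one ICMP echo with a 2 second timeout.
check_reachable() {
    for addr in "$@"; do
        if ping -c 1 -W 2 "$addr" >/dev/null 2>&1; then
            echo "$addr reachable"
        else
            echo "$addr UNREACHABLE"
        fi
    done
}

# Hypothetical helper: list the adapter type recorded for each virtual NIC in
# a .vmx file, one "ethernetN type" pair per line. Adapters without a
# virtualDev line are simply not listed by this sketch.
vmx_adapter_types() {
    sed -n 's/^\(ethernet[0-9][0-9]*\)\.virtualDev *= *"\([^"]*\)".*/\1 \2/p' "$1"
}

# Usage (placeholders, not run here):
#   check_reachable 192.0.2.10 198.51.100.10      # eth0 and eth1 addresses
#   vmx_adapter_types /path/to/ucs.vmx
```

Run `check_reachable` from another machine on each network, not from the UCS guest itself, since the guest could ping its own interfaces even while it was unreachable.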

Tested on

  • Tested with 4.4-6 and 4.4-7. I did not upgrade VMware tools for either.

Recommendation for UCS team
I am not sure why the Flexible adapter type is used. It might make sense to switch this to VMXNET3 going forward. VMXNET3 has been around for a while now, and most newer operating systems play nicely with it.
