Network Health with VMs and Clusters

Network health depends as much on proactive planning as on maintenance. In an infrastructure built on VMs and clusters, several types of network traffic flow through the physical network and its network adapters. Network health also depends on reducing single points of failure. Using NIC teaming for connections that are otherwise non-redundant, such as client connectivity; adding a secondary NIC; keeping a third NIC preconfigured but disabled as standby hardware until it is needed; and using multiple switches all help build network redundancy. How the network will be reconnected or redirected after a failure is another consideration, and it belongs in DR planning and testing.
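
The teaming and standby-NIC measures above could be sketched as follows, assuming Windows Server 2012 or later with the built-in NetLbfo and NetAdapter cmdlets. The adapter names (NIC1, NIC2, NIC3) and the team name are hypothetical placeholders, and the teaming mode shown is only one possible choice.

```powershell
# Hypothetical adapter and team names; substitute the names reported by Get-NetAdapter.
$teamName    = 'ClientTeam'
$teamMembers = 'NIC1', 'NIC2'
$standbyNic  = 'NIC3'

# Team two adapters so client connectivity survives the loss of one NIC.
# SwitchIndependent mode allows the members to connect to different physical switches.
New-NetLbfoTeam -Name $teamName -TeamMembers $teamMembers -TeamingMode SwitchIndependent -Confirm:$false

# Keep the third, preconfigured adapter disabled until it is needed.
Disable-NetAdapter -Name $standbyNic -Confirm:$false

# During a failover or a DR test, bring the standby adapter online:
# Enable-NetAdapter -Name $standbyNic
```

Connecting the team members to separate switches covers both the NIC and the switch as single points of failure.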

Traffic planning for network health covers management traffic, cluster traffic, live migration, storage, and replication. Management traffic can be isolated with an IPsec tunnel; it carries the traffic for running the VMs, client-server access, AD, DNS, WSUS, and other OS management services. Cluster traffic monitors the nodes and their access to the data store: the nodes exchange regular heartbeats to determine each node’s health, and if a node is down, the cluster network can redirect I/O to another cluster node that can access the data store. Live migrations can saturate network links, so designating a network for them in the cluster and using a VLAN to maintain QoS benefits the overall health of the network. Storage traffic, whether SMB or iSCSI, should be isolated as well so that network connectivity remains reliable for VMs on a storage network. Unlike live migration, Replica cannot be placed on a dedicated network, because it discovers and uses whatever network is available to transmit the replication information. However, throttling its bandwidth with QoS is one way to preserve the network’s health. Another is to provide a static route from one NIC to another, along with controlling the bandwidth with QoS.
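
The QoS throttling described above might look like the following sketch, assuming the NetQos cmdlets available in Windows Server 2012 and later. The policy names, the rate limits, and the replication port are illustrative assumptions, not values taken from a specific environment; the port must match whatever the Replica server actually listens on.

```powershell
# Cap live migration traffic so it cannot saturate a shared link.
# -LiveMigration matches the default live migration port.
# PowerShell expands 2GB to 2147483648, which this parameter reads as bits per second.
New-NetQosPolicy -Name 'Live Migration cap' -LiveMigration -ThrottleRateActionBitsPerSecond 2GB

# Cap replication traffic by destination port.
# Port 443 is an assumption (certificate-based replication over HTTPS);
# the policy also throttles any other traffic sent to that port.
$replicaPort = 443
New-NetQosPolicy -Name 'Replica cap' -IPDstPortMatchCondition $replicaPort -ThrottleRateActionBitsPerSecond 200MB

# Review the policies that are now defined.
Get-NetQosPolicy
```

Because a port-based policy cannot tell replication apart from other traffic on the same port, a dedicated listener port on the Replica server keeps the throttle from affecting unrelated connections.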

PowerShell has many cmdlets that can be used to configure clusters, manage network traffic, and perform many other administration tasks. Some of those used in the tasks and configurations above are:

  • (Get-Cluster).SecurityLevel = 2 sets inter-node communication to encrypted.
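  • (Get-Cluster).SameSubnetDelay = 2000 sets the interval between the heartbeats that nodes on the same subnet exchange for the cluster network’s health check to 2 seconds. The value is given in milliseconds; the number of missed heartbeats tolerated before a node is considered down is controlled separately by SameSubnetThreshold.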
  • (Get-ClusterNetwork "Cluster Network 1").Role = 3 enables Cluster Network 1 for both client and cluster communication if a default gateway is present.
  • Set-SmbServerConfiguration -EnableMultiChannel $false disables SMB Multichannel.
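
Before changing any of these settings, it may help to record the current values. A minimal sketch, assuming it is run on a cluster node where the FailoverClusters and SmbShare modules are available:

```powershell
Import-Module FailoverClusters

# Current encryption level and heartbeat settings for the cluster.
Get-Cluster | Format-List Name, SecurityLevel, SameSubnetDelay, SameSubnetThreshold

# Role of each cluster network: 0 = none, 1 = cluster only, 3 = cluster and client.
Get-ClusterNetwork | Format-Table Name, Role, Address, AddressMask

# Whether SMB Multichannel is enabled on this server.
Get-SmbServerConfiguration | Select-Object EnableMultiChannel
```

Running the same three commands again after a change gives a simple before-and-after record for DR documentation.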