Configuring a Cluster

If you have two Netvisor OS switches and want them to work together so that networking services continue if one of the switches fails, the switches must be members of the same fabric and you must configure them as a cluster.

When you create a cluster configuration, you specify the nodes as cluster-node-1 and cluster-node-2. Cluster-node-1 is the primary node and cluster-node-2 is the secondary node; these assignments do not change unless you explicitly reconfigure them.

The primary/secondary reference is asymmetric, and some protocols use the two roles to introduce asymmetry into their behavior.

A cluster-link consists of the port or ports that directly connect the two cluster nodes. If more than one port is used, the cluster-link refers to the trunk (LAG) formed by those ports.
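
If more than one port connects the nodes, you can list the trunks configured on a node to see the LAG that carries the cluster-link. The command below is a general illustration; the trunk name and the columns displayed depend on your configuration and Netvisor OS release:

CLI network-admin@switch > trunk-show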

VLAN 4094 is a reserved VLAN used for cluster synchronization traffic. It is added to the in-band interface port and cluster-link automatically when you create the cluster configuration.
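
To confirm that the reserved VLAN exists after the cluster is created, you can filter vlan-show by VLAN ID. The id filter shown here is an assumption; adjust it if your release accepts different vlan-show arguments:

CLI network-admin@switch > vlan-show id 4094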

Netvisor OS detects cluster-links using an extra data set sent in LLDP messages. When a cluster-link is detected, VLAN 4094 is automatically added to it.
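
You can inspect the LLDP neighbor information that this detection relies on with the lldp-show command. The output columns vary by release, so this is only an illustration:

CLI network-admin@switch > lldp-show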

Netvisor OS performs cluster synchronization over the control network of the fabric. When the control network is in-band, synchronization uses the clust4094 vNIC on VLAN 4094 over the cluster-links; when it is the management network, synchronization takes place over the management interface.
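
To check which control network a node uses, you can limit fabric-node-show to the relevant fields. The field list passed to format below is an assumption based on the fields shown in the full output later in this section:

CLI network-admin@switch > fabric-node-show format name,control-network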

Cluster synchronization uses keep-alive messages to detect if the peer cluster node is online. Cluster synchronization messages contain the following information:

The online or offline state of the cluster synchronization. In addition, version numbers are exchanged so that messages can be adjusted to ensure backward compatibility. Each cluster node sends a synchronization message to the other node every 2 seconds. If a node misses three synchronization messages in a row, the cluster goes offline; when the node comes back online, cluster synchronization resumes.
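
In other words, with a 2-second interval and a three-message threshold, a failed peer is detected after roughly 3 × 2 = 6 seconds of missed keep-alive messages.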

To set up a cluster of the two switches pleiades4 and pleiades6, first verify that both switches are members of the same fabric:

CLI network-admin@switch > fabric-node-show layout vertical
id:                     184549641
name:                   pleiades4
fab-name:               corp-fab
fab-id:                 b000109:5695af4f
cluster-id:             0:0
local-mac:              3a:7f:b1:43:8a:0f
fabric-network:         in-band
control-network:        in-band
mgmt-ip:                10.9.19.203/16
mgmt-mac:               ec:f4:bb:fe:06:20
mgmt-secondary-macs:
in-band-ip:             192.168.168.203/24
in-band-mac:            3a:7f:b1:43:8a:0f
in-band-vlan:           0
in-band-secondary-macs:
fab-tid:                1
cluster-tid:            0
out-port:               0
version:                2.4.204009451,#47~14.04.1-Ubuntu
state:                  online
firmware-upgrade:       not-required
device-state:           ok
ports:                  104

To create a cluster configuration, use the following command:

CLI network-admin@switch > cluster-create name cluster1 cluster-node-1 corp-switch1 cluster-node-2 corp-switch2

To verify the status of the cluster, use the cluster-show command:

CLI network-admin@switch > cluster-show
name       state    cluster-node-1   cluster-node-2
---------  -------  ---------------  --------------
cluster1   online   corp-switch1     corp-switch2

To replace a failed cluster node, use the cluster-repeer command: evict the failed node from the fabric, replace the node, and then run the cluster-repeer command on an active node.
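
As a sketch of that workflow only, assuming corp-switch2 is the failed node, the sequence might look like the following. The fabric-node-evict command name and the bare cluster-repeer invocation are assumptions; check the command help on your Netvisor OS release for the exact syntax and the parameters needed to identify the replacement node:

CLI network-admin@switch > fabric-node-evict name corp-switch2
CLI network-admin@switch > cluster-repeer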

To display information about the cluster, use the cluster-info command:

CLI network-admin@switch > cluster-info format all layout vertical
name:           cluster-leaf
state:          online
cluster-node-1: draco01
cluster-node-2: draco02
tid:            2
mode:           master
ports:          69-71,129
remote-ports:   69-71,128

If you want to connect the cluster nodes to an uplink switch, you must configure a VLAG between the ports on the cluster nodes and the uplink switch. For example, if corp-switch1 has port 53 connected to the uplink switch and corp-switch2 has port 19 connected to the uplink switch, create a VLAG by executing the vlag-create command on either of the switches:

CLI network-admin@switch > vlag-create name VLAG-uplink local-port 19 peer-switch corp-switch1 peer-port 53

This example assumes that you’ve entered the command on corp-switch2, where port 19 is the local port connected to the uplink switch.

To verify the configuration, use the following command:

CLI network-admin@switch > vlag-show
name      cluster    mode           switch   port  peer-switch
--------  ---------  -------------  -------  ----  -----------
switch-1  cluster-2  active-active  spine-1  34    spine-2

peer-port  status  local-state  lacp-mode
---------  ------  -----------  ---------
129        normal  enabled      active

Informational Note:  Before you can create a VLAG, you must configure the two switches in a cluster.

More Information on vLAGs

Netvisor OS uses vLAG synchronization to coordinate Active-Standby and Active-Active vLAGs using the following rules:

Active-Standby vLAGs

Active-Active vLAGs

Modifying LACP Mode on an Existing VLAG Configuration

You can modify the LACP mode on an existing VLAG configuration:

CLI network-admin@switch > vlag-modify

name name-string
Specify the VLAG name.

failover-move-L2|failover-ignore-L2
If you specify the failover-move-L2 parameter, Netvisor OS sends gratuitous ARPs.

lacp-mode off|passive|active
Specify the LACP mode as off, passive, or active.

lacp-timeout slow|fast
Specify the LACP timeout as slow (30 seconds) or fast (4 seconds).

lacp-fallback bundle|individual
Specify the LACP fallback mode as bundle or individual.

lacp-fallback-timeout seconds
Specify the LACP fallback timeout in seconds. The default is 50 seconds.
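
For example, to change the LACP mode and timeout on the VLAG-uplink VLAG created earlier, you could run a command like the following:

CLI network-admin@switch > vlag-modify name VLAG-uplink lacp-mode active lacp-timeout fast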