About Split Brain and Detection Script



In Pluribus parlance, a cluster has two switches (nodes) that operate as a single logical switch. These switches exchange control messages over the control network, both periodically and on an event-driven basis, to keep the status and tables (such as L2 tables, vLAG, STP, and cluster states) synchronized between the two switches. If one of the nodes in the cluster fails to communicate with the peer node for three consecutive cluster-sync messages, it sets the cluster to offline mode and attempts to function in independent mode. If one of the cluster nodes is down, operating as an independent node helps maintain continuity.
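

You can check the current cluster state from the CLI with the cluster-show command. The output below is a minimal, illustrative sketch; the cluster name is hypothetical and the columns selected with the format option may vary by Netvisor ONE version:


CLI (network-admin@switch) > cluster-show format name,state

name      state

--------- ------

cluster-1 online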


However, if both nodes are up but are unable to synchronize (for example, because the cluster network is down), having both nodes operate in independent mode is not desirable. This situation can lead to duplicated broadcast, unknown unicast, and multicast (BUM) traffic or to traffic loss. This condition is known as split brain.


To mitigate the split brain condition, Netvisor ONE provides two mechanisms:


  1. Handling of Split Brain by Disabling STP (starting with Netvisor ONE version 7.0.1)
  2. Handling of Split Brain Using the Detection Script (in Netvisor ONE version 6.1.1 HF4)


Handling of Split Brain by Disabling STP


Starting with Netvisor ONE version 7.0.1, the split brain handling capability of Netvisor has been enhanced to allow seamless ingress traffic on orphan ports. That is, during a split brain condition, when STP is disabled on the nodes of a Layer 3 fabric setup:


  • All ports on both nodes of the cluster remain "up" even when a cluster node goes offline while STP is disabled. However, flood traffic (towards vLAG ports) is dropped after crossing the cluster link because of the egress filtering rules.
  • Ingress traffic on orphan ports that enters the peer switch is forwarded through the cluster port (assuming the L2 table has the entry and it has not aged out), so traffic on orphan ports remains active.
  • When the cluster node comes back online, it re-synchronizes without requiring a switch reboot.


System Behavior with STP Enabled


When STP is enabled, if the control network goes down (because it is disabled or due to a network issue), the cluster port goes into the discarding state (use the stp-state-show command to confirm) because the cluster STP sync messages cannot be exchanged. In other words, in-band connectivity is lost and data traffic cannot flow over the cluster link. This behavior applies even if the control network is configured as mgmt. Since the node is unreachable over both in-band and management, the state in the fabric-node-show output displays as offline. See the example below:


CLI (network-admin@switch) > fabric-node-show


fab-name mgmt-ip        in-band-ip in-band-vlan-type fab-tid cluster-tid out-port state

-------- -------------- ---------- ----------------- ------- ----------- -------- -------

fab1     10.13.48.59/24 1.1.1.1/24 public            49      5           272      offline


CLI (network-admin@switch*) > stp-state-show ports 272


vlan: 1,4093,4095

ports: none

instance-id: 0

name: stg-default

bridge-id: 66:0e:94:5e:fb:b2

bridge-priority: 32768

root-id: 66:0e:94:5e:fb:b2

root-priority: 32768

root-port: 0

hello-time: 2

forwarding-delay: 15

max-age: 20

disabled: none

learning: none

forwarding: none

discarding: 272 >>>>>>>>>> discarding 

edge: none

designated: 272


Note: If the management network IP is changed so that it becomes unreachable for unicast fabric messages while connectivity remains available for fabric multicast messages, the state in the fabric-node-show output oscillates between the mgmt-only-online and offline states.


Handling of Split Brain Using the Detection Script


If you are using Netvisor ONE version 6.1.1 HF1, you must use the script provided as part of the Netvisor package for the detection of and recovery from a split brain condition.


As a prerequisite, you must install the split brain detection script as a service on both nodes of the cluster pair. Netvisor ONE supports the control network over management or over the in-band IP. The control network can be set to mgmt or in-band by using the fabric-local-modify control-network [in-band|mgmt] command, as shown below.
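

For example, to set the control network to the management interface (or, analogously, to in-band), run:


CLI (network-admin@switch) > fabric-local-modify control-network mgmt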


Note: Starting with Netvisor ONE version 6.1.1 HF4, the split brain install and uninstall scripts are available in the /opt/nvOS/bin/pn-scripts directory. Having the scripts in the /opt/nvOS/bin/pn-scripts directory enables you to invoke them from both the Netvisor ONE CLI and the REST API shell prompt.


When the script is installed on leaf switches:

  • The script detects the split brain condition based on the following factors:
    • When the control network is over management, losing the management network connection results in split brain.
    • When the control network is over in-band, cluster links going down results in split brain.
  • Once the condition is detected, the script proceeds to quarantine the cluster slave (backup) switch. That is, all ports on the slave switch are brought down except the cluster ports (see the verification sketch after this list). The method the script uses to disable the ports does not persist across a switch reboot, so no manual intervention is required to re-enable the ports.
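

A quick way to confirm the quarantine on the slave switch is to list the port states. This is a minimal sketch; the port numbers shown are hypothetical and the port-show columns selected with the format option may differ by Netvisor ONE version:


CLI (network-admin@switch) > port-show format port,status

port status

---- --------

1    disabled

272  up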


To install and run the script as a system service, run the pn_split_brain_install.sh script from the Netvisor ONE CLI. The service persists across a switch reboot, so you do not need to re-run the script after every reboot or power cycle. To stop the script from running, use the pn_split_brain_uninstall.sh script.
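

Because the script runs as a systemd service (the unit name appears in the install output below), you can also confirm from the switch shell that the service is enabled to start at boot:


root@switch:~# systemctl is-enabled svc-nvOS-split-brain.service

enabled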


The script can also detect when the cluster is back online.


Below is an example of how to use the script. To install and run the script, use the command:


CLI (network-admin@switch*) > pn-script-run name pn_split_brain_install.sh

Executing /opt/nvOS/bin/pn-scripts/pn_split_brain_install.sh:

Created symlink from /etc/systemd/system/multi-user.target.wants/svc-nvOS-split-brain.service to /etc/systemd/system/svc-nvOS-split-brain.service.


To check the status of the script service, use the commands:


CLI (network-admin@switch*) > exit

root@switch*:~#

root@switch*:~# systemctl status svc-nvOS-split-brain.service

● svc-nvOS-split-brain.service - Service to check for split brain functionality in cluster-slave and disable ports

   Loaded: loaded (/etc/systemd/system/svc-nvOS-split-brain.service; enabled; vendor preset: enabled)

   Active: active (running) since Wed 2021-11-17 02:27:24 PST; 7s ago

 Main PID: 25780 (perl)

    Tasks: 2

   Memory: 12.8M

      CPU: 1.697s

   CGroup: /system.slice/svc-nvOS-split-brain.service

           ├─25780 /usr/bin/perl /usr/bin/pn_split_brain.pl

Dec 06 19:39:18 switch* systemd[1]: Started Service to check for split brain functionality in cluster-slave and disable ports.


To stop the script from running, use the pn_split_brain_uninstall.sh script as shown below:

 

CLI (network-admin@switch*) > pn-script-run name pn_split_brain_uninstall.sh

Executing /opt/nvOS/bin/pn-scripts/pn_split_brain_uninstall.sh:

Removed symlink /etc/systemd/system/multi-user.target.wants/svc-nvOS-split-brain.service.


To confirm that the service has been removed, use the commands:


CLI (network-admin@switch*) > exit

root@switch*:~#

root@switch*:~# systemctl status svc-nvOS-split-brain.service

● svc-nvOS-split-brain.service

   Loaded: not-found (Reason: No such file or directory)

   Active: inactive (dead)


To install the script on all switches of the fabric, use the command:


CLI (network-admin@switch*) > switch * pn-script-run name pn_split_brain_install.sh


To uninstall the script on all switches of the fabric, use the command:


CLI (network-admin@switch*) > switch * pn-script-run name pn_split_brain_uninstall.sh


Guidelines and Limitations


The following guidelines and limitations apply during a split brain condition:

 

  • The allow-offline-cluster-nodes option of the transaction-settings-modify command is turned OFF by default, so further fabric and cluster scoped transactions are not allowed until the cluster node is back online (see the sketch after this list).
  • Quarantining the cluster slave (backup) node mitigates the traffic loss caused by the split brain condition.
  • In a Fabric over Layer 3 deployment, to detect split brain you should have 'allow-as' enabled (so that the in-band IP of the cluster master is reachable via the spine switch), or the management link should be present.
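

If you need to allow fabric and cluster scoped transactions while a cluster node is offline, you can check and toggle the option referenced in the first guideline as sketched below. The transaction-settings-show command and the exact toggle keyword are assumptions based on the option name above; verify them against your Netvisor ONE version:


CLI (network-admin@switch) > transaction-settings-show

CLI (network-admin@switch) > transaction-settings-modify allow-offline-cluster-nodes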


Split Brain Script Status Verification Command


A new command, pn-service-status-show, is introduced in Netvisor ONE 6.1.1 HF5 to verify the status of the split brain script. Examples of how to check the status from both the CLI and the vREST API are provided below:


When the split-brain script is not installed:


CLI (network-admin@plu007switch*) > pn-service-status-show

name        status

----------- -------

split-brain offline


CLI (network-admin@plu007switch*) >

pluribus@plu-srv001:~$

pluribus@plu-srv001:~$ curl  -s -u network-admin:test123 -X GET http://plu007swl01/vRest/pn-services -k  | python -m json.tool

{

    "data": [

        {

            "name": "split-brain",

            "status": "offline"

        }

    ],

    "result": {

        "result": [

            {

                "api.switch-name": "local",

                "code": 0,

                "message": "",

                "scope": "local",

                "status": "Success"

            }

        ],

        "status": "Success"

    }

}

pluribus@plu-srv001:~$


When the split-brain script is installed:


CLI (network-admin@plu007switch*) > pn-script-run name pn_split_brain_install.sh

Executing /opt/nvOS/bin/pn-scripts/pn_split_brain_install.sh:

Created symlink from /etc/systemd/system/multi-user.target.wants/svc-nvOS-split-brain.service to /etc/systemd/system/svc-nvOS-split-brain.service.


CLI (network-admin@plu007switch*) > pn-service-status-show

name        status

----------- ------------------------------------------------------------------

split-brain Active: active (running) since Wed 2022-02-16 00:45:17 PST; 5s ago


CLI (network-admin@plu007switch*) >


pluribus@plu-srv001:~$ curl  -s -u network-admin:test123 -X GET http://plu007swl01/vRest/pn-services -k  | python -m json.tool

{

    "data": [

        {

            "name": "split-brain",

            "status": "Active: active (running) since Wed 2022-02-16 00:45:17 PST; 19s ago\n"

        }

    ],

    "result": {

        "result": [

            {

                "api.switch-name": "local",

                "code": 0,

                "message": "",

                "scope": "local",

                "status": "Success"

            }

        ],

        "status": "Success"

    }

}

pluribus@plu-srv001:~$


Note: When the split brain script is installed or uninstalled on a switch, a message with a timestamp is printed in the nvOSd.log and perror.log files. See the example below:

2022-02-13,20:04:41.358.: split-brain service is installed
2022-02-14,20:05:21.680.: split-brain service is uninstalled 
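

To confirm these events from the switch shell, you can search the log files for the message. The log directory path /var/nvOS/log used here is an assumption and may differ on your installation:


root@switch:~# grep "split-brain service" /var/nvOS/log/perror.log

2022-02-13,20:04:41.358.: split-brain service is installed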


Recovery Handling


To recover from split brain, the quarantined switch should be brought back to active service after the cluster comes back online. The script detects when the cluster is back online and proceeds to reboot the former cluster slave switch (which had detected the split brain condition and was quarantined), bringing it back to active service with its ports enabled.


To check the status of ports, use the command:


CLI (network-admin@switch) > cluster-bringup-show


In the output, the status displays ports-enabled, indicating that both devices of the cluster are now active and in service.

