About Split Brain and Detection Script
In Pluribus parlance, a cluster has two switches (nodes) that operate as a single logical switch. These switches exchange control messages over the control network, both periodically and on an event-driven basis, to keep the status and tables (such as the L2 tables and the vLAG, STP, and cluster states) synchronized between the two switches. If one of the nodes in the cluster fails to communicate with the peer node for three consecutive cluster-sync messages, the node sets the cluster to offline mode and attempts to function in an independent mode. If one of the cluster nodes is down, operating as an independent node helps to maintain continuity.
However, if both nodes are up but unable to sync up (for example, because the cluster network goes down), then having both nodes operate in independent mode is not desirable. This situation can lead to duplicated broadcast, unknown unicast, and multicast (BUM) traffic or to traffic loss. This condition is known as split brain.
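The failure detection rule described above can be modeled as a simple miss counter. The following Python sketch is purely conceptual and is not Netvisor code; the class and method names are illustrative, and only the threshold of three consecutive missed cluster-sync messages comes from the behavior described above:

```python
# Conceptual model of cluster-sync failure detection; not Netvisor code.
# Class and method names are illustrative placeholders.

MISS_THRESHOLD = 3  # three consecutive missed cluster-sync messages

class ClusterNode:
    def __init__(self):
        self.missed_syncs = 0
        self.cluster_online = True

    def on_sync_received(self):
        # A cluster-sync message from the peer resets the miss counter.
        self.missed_syncs = 0
        self.cluster_online = True

    def on_sync_timeout(self):
        # A missed cluster-sync interval increments the counter; after
        # three consecutive misses the node marks the cluster offline
        # and starts operating in independent mode.
        self.missed_syncs += 1
        if self.missed_syncs >= MISS_THRESHOLD:
            self.cluster_online = False

node = ClusterNode()
for _ in range(3):
    node.on_sync_timeout()
print(node.cluster_online)  # False: the node now runs independently
```

Note that a single successful sync resets the counter, so only consecutive misses trigger the offline transition.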
To mitigate the split brain condition, Netvisor ONE provides two mechanisms:
- Handling of Split Brain by disabling STP (starting with Netvisor ONE version 7.0.1)
- Handling of Split Brain using Detection Script (in Netvisor ONE version 6.1.1 HF4)
Handling of Split Brain by Disabling STP
Starting with Netvisor ONE version 7.0.1, the split brain handling capability of Netvisor has been enhanced to allow seamless ingress traffic on the orphan ports. That is, during a split brain condition, when STP is disabled on the nodes of a Layer 3 fabric setup:
- All ports on both nodes of the cluster remain "up" even when a cluster node goes offline while STP is disabled. However, flood traffic (towards vLAG ports) gets dropped after crossing the cluster link because of the egress filtering rules.
- For ingress traffic on orphan ports, traffic entering the peer switch gets forwarded through the cluster port (assuming the L2 table has the entry and it has not aged out), and traffic on orphan ports remains active.
- When the cluster node comes back online, it re-synchronizes without having to reboot the switch.
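The egress filtering behavior on flood traffic can be illustrated with a small conceptual model. This is not Netvisor code; the function and the port numbers are hypothetical, and the sketch only captures the rule that BUM traffic which crossed the cluster link is filtered from vLAG ports (because the peer switch already delivered it on its own vLAG leg):

```python
# Conceptual sketch of the egress filtering rule described above.
# Not Netvisor code: the function name and port numbers are illustrative.

def egress_ports(flood_ports, vlag_ports, ingress_was_cluster_link):
    """Return the set of ports a flooded (BUM) frame is sent out on."""
    if ingress_was_cluster_link:
        # Frames that crossed the cluster link are filtered from vLAG
        # ports: the peer switch already delivered them on its own vLAG
        # leg, so re-flooding here would duplicate the frame.
        return flood_ports - vlag_ports
    return flood_ports

# A frame arriving on a local orphan port floods to all flood ports:
print(sorted(egress_ports({1, 2, 3}, {2, 3}, False)))  # [1, 2, 3]
# The same frame, after crossing the cluster link, reaches orphan ports only:
print(sorted(egress_ports({1, 2, 3}, {2, 3}, True)))   # [1]
```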
System Behavior with STP Enabled
When STP is enabled, if the control network goes down (due to being disabled or to a network issue), the cluster port goes to the discarding state (use the stp-state-show command to confirm) because the cluster STP sync messages cannot be exchanged. In other words, in-band connectivity is lost and data traffic cannot flow over the cluster link. This behavior applies even if the control network is configured as mgmt. Since node reachability over both in-band and management is down, the state in the fabric-node-show command output displays as offline. See the example below:
CLI (network-admin@switch) > fabric-node-show
fab-name mgmt-ip in-band-ip in-band-vlan-type fab-tid cluster-tid out-port state
-------- ------------- ---------- ----------------- -------- ---------- -------- ------
fab1 10.13.48.59/24 1.1.1.1/24 public 49 5 272 offline
CLI (network-admin@switch*) > stp-state-show ports 272
vlan: 1,4093,4095
ports: none
instance-id: 0
name: stg-default
bridge-id: 66:0e:94:5e:fb:b2
bridge-priority: 32768
root-id: 66:0e:94:5e:fb:b2
root-priority: 32768
root-port: 0
hello-time: 2
forwarding-delay: 15
max-age: 20
disabled: none
learning: none
forwarding: none
discarding: 272 >>>>>>>>>> discarding
edge: none
designated: 272
Note: If the management network IP address is changed so that it is unreachable for unicast fabric messages while connectivity is still available for fabric multicast messages, the state in the fabric-node-show output oscillates between the mgmt-only-online and offline states.
Handling of Split Brain Using Detection Script
If you are using Netvisor ONE version 6.1.1 HF1, you must use the script provided as part of the Netvisor package for the detection and recovery of the split brain condition.
As a prerequisite, you must install the split brain detection script as a service on both nodes of the cluster pair. Netvisor ONE supports the control network over management or over the in-band IP. The control network can be set to mgmt or in-band by using the fabric-local-modify control-network [in-band|mgmt] command.
Note: Starting with Netvisor ONE version 6.1.1 HF4, the split brain install and uninstall scripts are available in the /opt/nvOS/bin/pn-scripts directory, which enables you to invoke them from both the Netvisor ONE CLI and the REST API shell prompt.
When the script is installed on leaf switches:
- The script detects the split brain condition based on the following factors:
- When the control network is over management, losing the management network connection results in split brain.
- When the control network is over in-band, the cluster links going down results in split brain.
- Once the cause is detected, the script proceeds to quarantine the cluster slave (backup) switch. That is, all ports on the slave switch except the cluster ports are brought down. The method employed by the split brain script to disable the ports does not persist across a switch reboot, hence no manual intervention is required to re-enable the ports.
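The detection and quarantine flow above can be summarized as follows. This Python sketch is a simplified conceptual model, not the actual pn_split_brain.pl logic; the function names and port numbers are illustrative:

```python
# Simplified model of the detection and quarantine flow described above.
# Not the actual pn_split_brain.pl logic: all names are illustrative.

def split_brain_detected(control_network, mgmt_link_up, cluster_links_up):
    """Map the control-network type to its split brain trigger."""
    if control_network == "mgmt":
        return not mgmt_link_up       # management network connection lost
    if control_network == "in-band":
        return not cluster_links_up   # cluster links went down
    raise ValueError("control network must be 'mgmt' or 'in-band'")

def quarantine(role, ports, cluster_ports):
    """On the slave, bring down every port except the cluster ports."""
    if role != "slave":
        return set()                  # the master keeps serving traffic
    return ports - cluster_ports      # ports the script disables

if split_brain_detected("in-band", mgmt_link_up=True, cluster_links_up=False):
    disabled = quarantine("slave", {9, 10, 65, 66}, {65, 66})
    print(sorted(disabled))  # [9, 10]: all but the cluster ports go down
```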
To install and run the script as a system service, run the pn_split_brain_install.sh script by using the Netvisor ONE CLI. The service persists across a switch reboot, so you do not need to re-run the script after every reboot or power cycle. To stop the script from running, use the pn_split_brain_uninstall.sh script.
The script can also detect when the cluster is back online.
Below is an example of how to use the script. To install and run the script, use the command:
CLI (network-admin@switch*) > pn-script-run name pn_split_brain_install.sh
Executing /opt/nvOS/bin/pn-scripts/pn_split_brain_install.sh:
Created symlink from /etc/systemd/system/multi-user.target.wants/svc-nvOS-split-brain.service to /etc/systemd/system/svc-nvOS-split-brain.service.
To check the status of the script, use the command:
CLI (network-admin@switch*) > exit
root@switch*:~#
root@switch*:~# systemctl status svc-nvOS-split-brain.service
● svc-nvOS-split-brain.service - Service to check for split brain functionality in cluster-slave and disable ports
Loaded: loaded (/etc/systemd/system/svc-nvOS-split-brain.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2021-11-17 02:27:24 PST; 7s ago
Main PID: 25780 (perl)
Tasks: 2
Memory: 12.8M
CPU: 1.697s
CGroup: /system.slice/svc-nvOS-split-brain.service
├─25780 /usr/bin/perl /usr/bin/pn_split_brain.pl
Dec 06 19:39:18 switch* systemd[1]: Started Service to check for split brain functionality in cluster-slave and disable ports.
To stop the script from running, use the pn_split_brain_uninstall.sh command as below:
CLI (network-admin@switch*) > pn-script-run name pn_split_brain_uninstall.sh
Executing /opt/nvOS/bin/pn-scripts/pn_split_brain_uninstall.sh:
Removed symlink /etc/systemd/system/multi-user.target.wants/svc-nvOS-split-brain.service.
To check the status of the script, use the command:
CLI (network-admin@switch*) > exit
root@switch*:~#
root@switch*:~# systemctl status svc-nvOS-split-brain.service
● svc-nvOS-split-brain.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)
To install the script on all switches of the fabric, use the command:
CLI (network-admin@switch*) > switch * pn-script-run name pn_split_brain_install.sh
To uninstall the script on all switches of the fabric, use the command:
CLI (network-admin@switch*) > switch * pn-script-run name pn_split_brain_uninstall.sh
Guidelines and Limitations
The following guidelines and limitations apply during a split brain condition:
- The allow-offline-cluster-nodes option of the transaction-settings-modify command is turned OFF by default, and further fabric- and cluster-scoped transactions are not allowed until the cluster node is back online.
- Quarantining the cluster slave (backup) node mitigates the traffic loss caused by the split brain issue.
- In a Fabric over Layer 3 deployment, to detect split brain, you should either have 'allow-as' enabled (so that the in-band IP of the cluster master is reachable via the spine switch) or have the management link present.
Split Brain Script Status Verify Command
A new command, pn-service-status-show, is introduced in Netvisor ONE 6.1.1 HF5 to verify the status of the split brain script. Examples of how to check the status on both the CLI and the vREST API prompt are provided below.
When the split-brain script is not installed:
CLI (network-admin@plu007switch*) > pn-service-status-show
name status
----------- -------
split-brain offline
CLI (network-admin@plu007switch*) >
pluribus@plu-srv001:~$
pluribus@plu-srv001:~$ curl -s -u network-admin:test123 -X GET http://plu007swl01/vRest/pn-services -k | python -m json.tool
{
"data": [
{
"name": "split-brain",
"status": "offline"
}
],
"result": {
"result": [
{
"api.switch-name": "local",
"code": 0,
"message": "",
"scope": "local",
"status": "Success"
}
],
"status": "Success"
}
}
pluribus@plu-srv001:~$
When the split-brain script is installed:
CLI (network-admin@plu007switch*) > pn-script-run name pn_split_brain_install.sh
Executing /opt/nvOS/bin/pn-scripts/pn_split_brain_install.sh:
Created symlink from /etc/systemd/system/multi-user.target.wants/svc-nvOS-split-brain.service to /etc/systemd/system/svc-nvOS-split-brain.service.
CLI (network-admin@plu007switch*) > pn-service-status-show
name status
----------- ------------------------------------------------------------------
split-brain Active: active (running) since Wed 2022-02-16 00:45:17 PST; 5s ago
CLI (network-admin@plu007switch*) >
pluribus@plu-srv001:~$ curl -s -u network-admin:test123 -X GET http://plu007swl01/vRest/pn-services -k | python -m json.tool
{
"data": [
{
"name": "split-brain",
"status": "Active: active (running) since Wed 2022-02-16 00:45:17 PST; 19s ago\n"
}
],
"result": {
"result": [
{
"api.switch-name": "local",
"code": 0,
"message": "",
"scope": "local",
"status": "Success"
}
],
"status": "Success"
}
}
pluribus@plu-srv001:~$
Note: When the split brain script is installed or uninstalled on a switch, a message with a time-stamp is printed in the nvOSd.log and perror.log files. See the example below:
2022-02-13,20:04:41.358.: split-brain service is installed
2022-02-14,20:05:21.680.: split-brain service is uninstalled
Recovery Handling
To recover from split brain, the quarantined switch should be brought back into active service after the cluster comes back online. The script detects when the cluster is back online, then reboots the former cluster slave switch (which had detected the split brain condition and was quarantined) and brings it back into active service with its ports enabled.
To check the status of ports, use the command:
CLI (network-admin@switch) > cluster-bringup-show
The status displays ports-enabled, indicating that both devices of the cluster are now active and up for service.
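The recovery sequence can be outlined in the same conceptual style. This is a hypothetical sketch of the behavior described above, not the script's actual code; the dictionary keys are illustrative:

```python
# Hypothetical outline of the recovery behavior described above.
# Not the actual script: names are illustrative placeholders.

def recover(node):
    """Once the cluster is back online, un-quarantine via a reboot.

    The script's port-disable method does not persist across a reboot,
    so rebooting the quarantined slave re-enables all of its ports.
    """
    if node["quarantined"] and node["cluster_online"]:
        node["reboot_requested"] = True   # the reboot clears disabled ports
        node["quarantined"] = False
    return node

node = {"quarantined": True, "cluster_online": True, "reboot_requested": False}
print(recover(node)["reboot_requested"])  # True: the slave is rebooted
```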