Troubleshooting the Fabric
Rolling Back and Rolling Forward Transactions
Use the transaction-show command to display the list of executed transactions. For example, committed transaction 20 records the command with which Netvisor ONE disabled port 12:
CLI network-admin@switch > transaction-show
start-time: 10:05:22
end-time: 10:05:22
scope:        local
tid:          20
state:        commit
command:      port-config-modify port 12 disable
undo-command: port-config-modify port 12 enable
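If you capture this output in a script, the key and value pairs are straightforward to read into a dictionary. The following is a minimal Python sketch, not a Netvisor ONE tool: the function name parse_transaction is hypothetical, and it assumes the output always has the "key: value" layout shown above.

```python
def parse_transaction(text):
    """Parse 'key: value' lines from transaction-show output into a dict."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, so time values such as 10:05:22 stay intact.
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon
            record[key.strip()] = value.strip()
    return record


# Sample copied from the transaction-show output above.
sample = """\
start-time: 10:05:22
end-time: 10:05:22
scope:        local
tid:          20
state:        commit
command:      port-config-modify port 12 disable
undo-command: port-config-modify port 12 enable
"""

txn = parse_transaction(sample)
print(txn["tid"], "->", txn["undo-command"])
```

The undo-command field is the value that a rollback of this transaction would apply.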
The Netvisor ONE transaction log file, xact_config.log, contains the list of commands corresponding to the executed transactions as well as the undo command(s) required to undo each transaction.
The list of undo commands can be used to revert to a previous state, that is, to roll back one or more transactions starting from the latest one.
Conversely, the list of executed commands can be used to redo transactions, that is, to roll forward one or more transactions.
Two commands perform these actions. Use the transaction-rollback-to command to roll the entire fabric back to an earlier fabric transaction number. Use the transaction-rollforward-to command to roll the entire fabric forward to a later fabric transaction number.
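The ordering matters: undo commands apply from the newest transaction backward, while redo commands replay from the oldest forward. The Python sketch below illustrates only that ordering. It does not read xact_config.log, whose on-disk format is not described here; the in-memory list, the function names undo_plan and redo_plan, and transactions 18 and 19 are hypothetical (only transaction 20 comes from the example above).

```python
# Hypothetical in-memory view of the log: (tid, command, undo-command).
# Only tid 20 is taken from the transaction-show example; 18 and 19 are illustrative.
log = [
    (18, "port-config-modify port 10 disable", "port-config-modify port 10 enable"),
    (19, "port-config-modify port 11 disable", "port-config-modify port 11 enable"),
    (20, "port-config-modify port 12 disable", "port-config-modify port 12 enable"),
]


def undo_plan(log, target_tid):
    """Undo commands needed to return to target_tid, newest transaction first."""
    return [undo for tid, _, undo in sorted(log, reverse=True) if tid > target_tid]


def redo_plan(log, current_tid, target_tid):
    """Commands to replay from current_tid up to target_tid, oldest first."""
    return [cmd for tid, cmd, _ in sorted(log) if current_tid < tid <= target_tid]


# Rolling back to tid 18 undoes 20 first, then 19.
print(undo_plan(log, 18))
# Rolling forward from tid 18 to tid 20 replays 19 first, then 20.
print(redo_plan(log, 18, 20))
```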
For instance, suppose the fabric state accidentally goes out of sync, and the fabric-node-show command output shows that one node is missing an interface addition transaction:
CLI network-admin@switch > fabric-node-show format name,fab-name,fab-tid,state,device-state
name      fab-name fab-tid state  device-state
--------- -------- ------- ------ ------------
pnswitch2 pnfabric 1       online ok
pnswitch1 pnfabric 2       online ok
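In this output, pnswitch2 is at fabric transaction 1 while pnswitch1 is at 2, so pnswitch2 is the node that missed a transaction. A script can make the same comparison. The following Python sketch is an illustration only: the function name lagging_nodes is hypothetical, and it assumes the five whitespace-separated columns shown above, with no spaces inside any value.

```python
def lagging_nodes(output):
    """Return {name: fab_tid} for nodes whose fab-tid is below the fabric maximum."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have five fields and a numeric fab-tid;
        # the header and separator rows fail the isdigit() check.
        if len(fields) == 5 and fields[2].isdigit():
            rows.append((fields[0], int(fields[2])))
    if not rows:
        return {}
    newest = max(tid for _, tid in rows)
    return {name: tid for name, tid in rows if tid < newest}


# Sample copied from the fabric-node-show output above.
sample = """\
name      fab-name fab-tid state  device-state
--------- -------- ------- ------ ------------
pnswitch2 pnfabric 1       online ok
pnswitch1 pnfabric 2       online ok
"""

behind = lagging_nodes(sample)
print(behind)
```

An empty result means that every listed node reports the same fabric transaction number.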
You can roll the state back to the last synchronized transaction ID across the whole fabric by specifying scope fabric:
CLI network-admin@switch > transaction-rollback-to scope fabric tid 1
Warning: rolled back transactions are unrecoverable unless another fabric node has them. Proceed? [y/n] y
If the rollback succeeds and no error message prints on the console, the change is complete and Netvisor removes the transaction from the transaction log.
If auto-recover is enabled, rolling the state forward happens automatically. You can also roll the state forward manually to retry the previously failed fabric-wide interface addition:
CLI network-admin@switch > transaction-rollforward-to scope fabric tid 2
Added interface eth2.13
After the transaction rolls forward successfully, Netvisor completes the change and updates the transaction log.
If multiple nodes go out of sync, you must recover each node separately.
If you apply a configuration to the fabric and a node does not respond to it, you may in certain circumstances want to evict the node from the fabric so that you can troubleshoot the problem on that device.
A node must be offline before you can evict it; otherwise, the eviction command fails. Once the node is offline, use the fabric-node-evict command to perform the eviction:
CLI network-admin@switch > fabric-node-evict name pnswitch2
CLI network-admin@switch > fabric-node-evict id b000021:52a1b620
The first example uses the switch name for identification, and the second uses the switch ID.
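Because eviction fails for a node that is not offline, an automation script may want to check the node state before it issues the command. The following Python sketch shows one way to express that guard. The function name evict_command is hypothetical, and the sketch assumes that an offline node is reported with the state string "offline"; it only builds the command text and does not contact a switch.

```python
def evict_command(name, state):
    """Build a fabric-node-evict command line, refusing nodes that are not offline."""
    # Assumption: fabric-node-show reports an offline node with the state "offline".
    if state != "offline":
        raise ValueError(f"{name} is {state}; a node must be offline before eviction")
    return f"fabric-node-evict name {name}"


print(evict_command("pnswitch2", "offline"))
```

Passing the state "online" raises ValueError instead of returning a command that would fail on the switch.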