Managing RMAs for Switches
A primary case for an RMA is a failed switch in the network. The configuration can be restored to a replacement switch using the following commands:
- fabric-join repeer-to-cluster-node
RMA Process for Version 2.6.0 and Later
Netvisor ONE fabric objects such as vRouters, VLAGs, clusters, and others are created on a switch in the fabric. Netvisor ONE tracks the switch using a location field. In releases before 2.6.0, this field holds the host ID of the switch where the fabric objects are configured.
This causes problems when a faulty switch is replaced with a new switch that has a new host ID. Fabric-wide configurations that reference the old host ID must be updated to the new host ID. These updates require extra manual steps, and it is not always clear which commands need to be run.
Netvisor ONE changes the location from a host ID to a fabric-specific location ID, which is assigned to each switch as the switch joins the fabric. The replacement switch keeps the same location ID through the RMA process, which reduces the process to a single command.
Netvisor ONE supports a new parameter, location-id, which is unique among the fabric nodes. Each node is assigned a new location ID when it joins the fabric.
All configurations that require a location are tied to the location ID instead of the host ID.
When Netvisor ONE executes the switch-config-import command, the location ID is inherited from the imported configuration. Therefore, no updates are required across the fabric, because all configurations already refer to the correct location ID.
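The effect of tying configurations to a location ID rather than a host ID can be illustrated with a small Python model. This is only a conceptual sketch: the FabricObject class, the stale_objects helper, and all ID values are hypothetical and do not reflect how Netvisor ONE stores its configuration.

```python
from dataclasses import dataclass


@dataclass
class FabricObject:
    """A fabric object (for example a vRouter) pinned to a location."""
    name: str
    location: int  # host ID in the old model, location ID in the new one


def stale_objects(objects, live_ids):
    """Return the names of objects whose location matches no live switch."""
    return [o.name for o in objects if o.location not in live_ids]


# Illustrative values only; real IDs are assigned by the switch and fabric.
OLD_HOST_ID, NEW_HOST_ID, LOCATION_ID = 1001, 2002, 7

# Old model: objects reference the host ID, so a replacement switch with
# a new host ID leaves every object pointing at a location that is gone.
by_host = [FabricObject("vrouter-1", OLD_HOST_ID),
           FabricObject("vlag-1", OLD_HOST_ID)]
stale_before = stale_objects(by_host, live_ids={NEW_HOST_ID})

# New model: objects reference the location ID, which the replacement
# switch inherits from the imported configuration.
by_location = [FabricObject("vrouter-1", LOCATION_ID),
               FabricObject("vlag-1", LOCATION_ID)]
stale_after = stale_objects(by_location, live_ids={LOCATION_ID})
```

In this model every object is stale after a host ID change, while none is stale when the location ID is carried over, which is why no fabric-wide updates are needed.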
The following commands are no longer necessary to restore an imported configuration on a new switch:
A new parameter, location-id, is added to the output of the node-info and fabric-node-show commands. It displays the location of the node.
A new command, fabric-node-location-mappings, displays the current mappings of fabric host IDs to location IDs. Its output is used as input for the switch-config-import command when importing configurations from earlier versions of software.
If you are importing a configuration from an earlier version of software, use the following syntax:
CLI (network-admin@Spine1) > switch-config-import upgrade-location-mappings
If the imported configuration already has location IDs, Netvisor ONE ignores the parameter.
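The upgrade behavior described here can be pictured with the following Python sketch. It is a conceptual model only: the dictionary layout, the uses_location_ids flag, the apply_location_mappings function, and the ID values are all assumptions made for illustration, not the actual configuration format or the output format of fabric-node-location-mappings.

```python
def apply_location_mappings(config, host_to_location):
    """Return a config whose objects reference location IDs."""
    if config["uses_location_ids"]:
        # Already in the new format: the mappings are ignored.
        return config
    objects = [
        {**obj, "location": host_to_location[obj["location"]]}
        for obj in config["objects"]
    ]
    return {"uses_location_ids": True, "objects": objects}


# Hypothetical host ID -> location ID mappings.
mappings = {1001: 7, 1002: 8}

# Hypothetical configuration exported from an earlier software version.
legacy = {
    "uses_location_ids": False,
    "objects": [{"name": "vrouter-1", "location": 1001}],
}

upgraded = apply_location_mappings(legacy, mappings)
# A second pass changes nothing, because location IDs are already present.
unchanged = apply_location_mappings(upgraded, mappings)
```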
RMA Process for Versions 2.5.4 and Earlier
This procedure assumes that the failed switch is part of an HA pair (cluster). Nodes that are part of a cluster automatically back up each other's configuration.
For an RMA case, the host ID differs between the new switch and the old failed switch. Both cluster membership and service object locations are tied to the host ID.
1) Retrieve the host ID of the old node:
CLI> fabric-node-show name <old-hostname> format name,id
2) Evict the old node from the fabric. This allows fabric provisioning operations to be processed before the RMA is complete. Additionally, the presence of the old node ID interferes with subsequent steps.
CLI> fabric-node-evict name <old-hostname>
3) Set up the new switch with basic settings, such as hostname and IP address. Perform this step at the console when the switch boots for the first time. These settings can be modified later.
4) Configure the new switch to rejoin the fabric. As it is part of a cluster, use the repeer-to-cluster-node option.
CLI> fabric-join name <fabric-name> repeer-to-cluster-node <existing-peer-name>
This downloads the entire backed-up configuration from the cluster peer and restarts Netvisor ONE to apply it. This restores the local, cluster, and fabric scoped configuration.
5) After the restart, any service objects that were present on the failed switch must be migrated to the new host. Use the value retrieved in Step 1 for the location parameter:
CLI> object-location-modify location <old-hostid> new-location <new-hostname>
The above command executes a bulk migration of all service objects (vRouters, vNET managers, OVSDB Interfaces) and sub-objects.
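The procedure above can be summarized as an ordered list of commands. The following Python sketch only formats the command strings shown in this section; it does not connect to a switch or run anything. The function name legacy_rma_commands and all sample values are hypothetical.

```python
def legacy_rma_commands(old_hostname, old_host_id, new_hostname,
                        fabric_name, peer_name):
    """Return the CLI commands for the 2.5.4-and-earlier RMA, in order.

    old_host_id is the value read from the output of the Step 1 command,
    so in practice Step 1 has to be run before the Step 5 command string
    can be completed.
    """
    return [
        # Step 1: record the host ID of the failed node.
        f"fabric-node-show name {old_hostname} format name,id",
        # Step 2: evict the failed node from the fabric.
        f"fabric-node-evict name {old_hostname}",
        # Step 3 (basic console setup of the new switch) has no command here.
        # Step 4: rejoin the fabric and restore from the cluster peer.
        f"fabric-join name {fabric_name} repeer-to-cluster-node {peer_name}",
        # Step 5: migrate service objects from the old host ID.
        f"object-location-modify location {old_host_id} "
        f"new-location {new_hostname}",
    ]


# Sample values for illustration only.
commands = legacy_rma_commands(
    old_hostname="leaf1",
    old_host_id="184549641",
    new_hostname="leaf1-new",
    fabric_name="fab1",
    peer_name="leaf2",
)
for command in commands:
    print(f"CLI> {command}")
```

Keeping the steps in one list makes the ordering explicit: the eviction comes before the join, and the object migration comes last, after Netvisor ONE has restarted.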