Replace a Failed Cluster Server
Data Server Node Replacement
One symptom of a failed Data Server Node is the appearance of offline nodes in the UNUM System Health dashboard as shown in the example below.
In the example, UNUM displays a single ESXi instance with 4 data nodes, all offline.
UNUM System Health Dashboard - Cluster
If a Cluster Server fails and you have received a replacement Server from Pluribus Networks, use the following instructions to rebuild the Cluster.
Note: The replacement Server you receive has VMware ESXi installed. You need to add the Server to the Cluster using the cluster_menu.sh configuration script.
Note: The replacement Server Node must be connected via the Eth0 Ethernet interface. Specify a static IP address if you use static IPs; otherwise, DHCP settings are used.
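If you plan to assign a static IP address, it may help to sanity-check the address before you type it into the configuration script. The following is a minimal sketch, not part of UNUM: the function name is_ipv4 is hypothetical, and it checks only the dotted-quad format, not whether the address is free or reachable on your network.

```shell
# Hypothetical helper (not shipped with UNUM): returns 0 if $1 looks like
# a dotted-quad IPv4 address with four octets in the range 0-255.
is_ipv4() {
  addr=$1
  # Reject empty input, non-digit characters, and empty octets.
  case "$addr" in
    "" | *[!0-9.]* | *..* | .* | *.) return 1 ;;
  esac
  count=0
  rest=$addr.
  # Peel off one octet per iteration until nothing is left.
  while [ -n "$rest" ]; do
    octet=${rest%%.*}
    rest=${rest#*.}
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    count=$((count + 1))
  done
  [ "$count" -eq 4 ]
}
```

For example, "is_ipv4 192.168.1.10 && echo valid" prints "valid", while an address such as 256.1.1.1 or 10.0.0 is rejected.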
1. Log in to the Remote Console of a Primary VM instance with your login credentials. If you have not changed the default credentials, the username is “vcf” and the password is “changeme”. The UNUM Cluster setup script is named “unum_provision.sh” and is located in the default folder “/home/vcf/srv/vcf/bin/tools/cluster”.
2. Run the setup script: ./unum_provision.sh
UNUM Cluster Menu - Setup Script
3. Select Option 2 - Manage Cluster from the deployment menu.
UNUM Cluster Menu - Manage Cluster
4. Select Option 5 - Node Management - from the setup menu.
UNUM Cluster Menu - Node Management
5. Select Option 2 - Replace Server - from Node Management.
UNUM Cluster Menu - Node Management - Replace Server
6. Follow the on-screen instructions. Enter the IP address of the VMware ESXi Primary Node. If the Primary Server Node has failed, use the IP address of a Data Server Node instead. However, the instructions for replacing a Primary Server Node differ slightly. Refer to Primary Server Node Replacement for more instructions.
UNUM Cluster Menu - Primary Server Node IP Address
7. Download the applicable Cluster OVA Template from the Pluribus Cloud. The downloaded OVA version must match the previously installed version. Enter the absolute path of the OVA template. Type Shift+U and then press the Tab key on your keyboard; the name of the downloaded OVA template is displayed. Press Enter to continue. For the VM Port Group Name, press Enter to accept the default, AutoCluster.
UNUM Cluster Menu - OVA Template Path - VM Port Group Name
8. Provisioning of the replacement Server begins.
UNUM Cluster Menu - Replacement Server Provisioning
When you replace a Data Node Server, auto-provisioning starts and details appear as the process continues.
The auto-provisioning process typically begins within 10 minutes and provisions the new Data Node Server.
UNUM Cluster Menu - Replacement Server Provisioning Details
UNUM restarts, and NTP details for each new Data Server Node are displayed along with a summary message indicating that Cluster Provisioning passed.
9. Press any key to continue and return to the configuration menu. Press 0 (zero) to exit.
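Steps 1 and 2 above can be sketched as a small shell wrapper that confirms the setup script is present before launching it. The directory and script name are taken from step 1; the function names launch_provision and print_menu_path are hypothetical and exist only for this illustration. The script itself is interactive, so the menu selections in steps 3 to 5 still have to be made by hand.

```shell
# Directory and script name as given in step 1.
CLUSTER_DIR="/home/vcf/srv/vcf/bin/tools/cluster"
SCRIPT_NAME="unum_provision.sh"

# Hypothetical wrapper: check that the script exists and is executable,
# then run it from its own directory. An optional first argument
# overrides the directory, which is handy for a dry run.
launch_provision() {
  dir=${1:-$CLUSTER_DIR}
  if [ ! -x "$dir/$SCRIPT_NAME" ]; then
    echo "Cannot find an executable $SCRIPT_NAME in $dir" >&2
    return 1
  fi
  # Run in a subshell so the caller's working directory is unchanged.
  ( cd "$dir" && "./$SCRIPT_NAME" )
}

# Reminder of the menu selections from steps 3 to 5.
print_menu_path() {
  echo "2 (Manage Cluster) > 5 (Node Management) > 2 (Replace Server)"
}
```

On the Primary VM instance, calling launch_provision with no argument is equivalent to changing to the default folder and running ./unum_provision.sh.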
At any time during the provisioning process you can review the status of the Data Server Nodes in the UNUM System Health dashboard.
Note: For each Data Server Node there is an Eth1 IP Address entry, and you may observe two entries per IP Address, one Offline and one Online. This is a normal, expected, and temporary condition that lasts until UNUM performs the next automatic data refresh, as shown in the images below. The refresh normally occurs within 20 to 25 minutes.
UNUM Cluster Menu - Replacement Server Offline / Online
UNUM Cluster Menu - Replacement Server Online
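Step 7 of the procedure above asks for the absolute path of the downloaded OVA template. A quick pre-check such as the following sketch can catch a mistyped path before you start the script. The function name check_ova_path is hypothetical, and the ".ova" file extension is an assumption about how the template is named. The sketch does not verify that the OVA version matches the previously installed version; that still has to be confirmed manually.

```shell
# Hypothetical pre-check (not part of UNUM): verify that the OVA template
# path is absolute, ends in .ova (assumed extension), and is readable.
check_ova_path() {
  ova=$1
  case "$ova" in
    /*) ;;  # absolute path, continue
    *) echo "Not an absolute path: $ova" >&2; return 1 ;;
  esac
  case "$ova" in
    *.ova) ;;  # expected extension, continue
    *) echo "Does not end in .ova: $ova" >&2; return 1 ;;
  esac
  if [ ! -r "$ova" ]; then
    echo "File missing or unreadable: $ova" >&2
    return 1
  fi
  echo "OK: $ova"
}
```

Call it with the same path you intend to enter at the OVA template prompt; a non-zero exit status means the path needs attention.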
Primary Server Node Replacement
Follow the instructions provided above for Data Server Node replacement; however, log in to an existing Data Server Node instead.
Note: When the new Primary Server Node is inserted into the Cluster with already provisioned Data Server Nodes and their respective IP addresses match, the Cluster will form.
You must run a “Restore Configuration” from the “UNUM_setup.sh” script located on the new Primary Server Node in the “/home/vcf” directory to restore previously stored data and configuration. On an UNUM Primary Server Node data is automatically backed up on a daily basis.
Select Option 8: Advanced Settings - Restore Configuration
Select Option 2 to restore your configuration.
Select the desired backup file from the list of Available Backups and follow the on-screen instructions.
Note: UNUM will be restarted during the process.
Option 2 - Advanced Settings Restore Configuration
Option 2 - Advanced Settings Restore Process
When a Data Server Node (with its data node VMs) is inserted into a Cluster that has a Primary Server Node and a Data Server Node, and its IP address matches the previous IP address, auto-provisioning begins and the Cluster eventually forms.