Zero Touch Replacement of Network Switches
Hardware failures happen. You can plan for them as well as you can, but eventually one will hit you. When it does, administrators must review the device backups, make sure they have the latest one, verify that any changes made between that backup and the failure are documented, and have a plan for remediating the failure.
Verity from BE Networks makes this daunting task fast and simple. It’s so easy, in fact, that we call it Zero Touch Replacement, or ZTR. It works the same way as Zero Touch Provisioning (ZTP), but instead of provisioning a brand-new device from scratch, Verity treats the replacement as a new device and automatically pushes the failed device’s saved configuration to it, so the environment is back up and running normally in hours instead of days or weeks. This blog details how to use this capability in Verity.
Zero Touch Replacement
So, you get a call from the NOC, and they say that Leaf2 is reporting down and doesn’t respond when they try to connect to it. Monitoring is also reporting a loss of communications. You walk into the data center and, sure enough, Leaf2 is dead: no power, no lights, a slight smell of ozone. You know the switch has met its end. You call the vendor to RMA the switch and have a replacement sent out. Luckily, you are a smart network administrator and you have a spare device ready to go, so the RMA unit will simply refill your supply. You pull the spare switch out of storage, then begin disconnecting the cables from the failed switch, labeling each one with the port it connects to, and remove the failed switch from the rack. You then install the new switch in the rack and reconnect the cables to the correct interfaces. You plug in the power and everything powers up. You then walk back to your desk and log in to Verity. Now comes the easy part.
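Since correct cabling is the one manual step here, it can help to write the labels down in a form you can check. The snippet below is a minimal sketch of that idea, not part of Verity: the port names and cable labels are made up for illustration, and `diff_cabling` is a hypothetical helper that compares what you recorded while pulling cables against what you plugged into the replacement.

```python
from __future__ import annotations


def diff_cabling(
    before: dict[str, str], after: dict[str, str]
) -> dict[str, tuple[str | None, str | None]]:
    """Return every port whose cable label differs between the two maps."""
    mismatches = {}
    for port in sorted(set(before) | set(after)):
        expected, found = before.get(port), after.get(port)
        if expected != found:
            mismatches[port] = (expected, found)
    return mismatches


# Hypothetical labels recorded while disconnecting the failed Leaf2.
before = {
    "Ethernet0": "spine1-eth4",
    "Ethernet4": "spine2-eth4",
    "Ethernet8": "server-a-nic1",
}

# Hypothetical labels read off the replacement after re-cabling
# (the two uplinks were swapped by mistake).
after = {
    "Ethernet0": "spine2-eth4",
    "Ethernet4": "spine1-eth4",
    "Ethernet8": "server-a-nic1",
}

for port, (expected, found) in diff_cabling(before, after).items():
    print(f"{port}: expected {expected}, found {found}")
```

An empty result means the replacement is cabled the same way the failed switch was.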
Marking the failed switch out of service
The first thing ZTR needs in order to do its magic is for the failed device to be marked out of service. To do this, go to the failed switch in the Topology view and click the “Mark out of Service” button:
When you click this button, a dialog box appears letting the operator know that this is a disruptive action and that any devices connected downstream from this switch will be impacted.
Luckily, this is a leaf pair in an MC-LAG configuration, so taking the device out of service automatically moves all traffic from this device to its peer and makes the peer primary. Since the device has already failed, this has already happened, so select “Yes.”
Updating the Device Controller to communicate with the new Device
You will notice that Leaf2 is now a darker shade of grey. This lets you and other operators using Verity know that this device is out of service.
To add the new switch in its place, we need to update the Device Controller with the new device’s unique identifier, such as the Service Tag or Serial Number, depending on the platform.
Go to the Administration Dashboard, select the VNFs tab, and find the Device Controller that manages the device you are replacing:
Double-click the Device Controller to zoom in to it:
Click the Edit button to edit the Controller details. In the LLDP Search String field, enter the new device’s Service Tag or Serial Number, whichever the vendor uses, and then click the check mark to save it.
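To see why the search string has to change, here is a rough illustration of identifier matching. It is only a sketch under stated assumptions: this post does not describe how Verity evaluates the LLDP Search String, so the substring match, the `LldpNeighbor` fields, and the `find_controller_matches` helper are all hypothetical, as are the tags and chassis IDs.

```python
from dataclasses import dataclass


@dataclass
class LldpNeighbor:
    chassis_id: str
    # Assumption: the serial number or service tag shows up in this field.
    system_description: str


def find_controller_matches(search_string: str, neighbors: list) -> list:
    """Return chassis IDs of neighbors whose description contains the string."""
    needle = search_string.strip().lower()
    if not needle:
        return []
    return [
        n.chassis_id
        for n in neighbors
        if needle in n.system_description.lower()
    ]


# Hypothetical LLDP advertisements seen after the swap. The failed switch
# (tag OLD1234) is gone, so only its peer and the replacement are visible.
neighbors = [
    LldpNeighbor("aa:bb:cc:00:00:01", "Leaf1, Service Tag PEER0001"),
    LldpNeighbor("aa:bb:cc:00:00:02", "Replacement leaf, Service Tag NEW5678"),
]

print(find_controller_matches("OLD1234", neighbors))  # stale string: no match
print(find_controller_matches("NEW5678", neighbors))  # updated string: one match
```

With the old identifier still in place, nothing on the wire matches, which is why the controller has to be updated before the switch goes back in service.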
Now we need to put the switch back in service to start the ZTR process.
Taking the switch out of MOS and automatically configuring the device
Go back to the switch and put it back in service. This means that the switch will go through the ZTP process of installing SONiC and will then be configured with the managed Leaf2 configuration, including all the endpoints and the underlay settings. To do this, click the MOS button:
The following dialog box appears letting you know that you are marking the device back in service.
Click Yes. The device will turn green, letting you know that it is now being configured and that the leaf pair is reconfiguring:
Over the next 30 minutes or so, the switch installs SONiC, configures the underlay automatically, and then configures the endpoints based on the configuration saved in Verity for the device. All you had to do was click a couple of buttons and update the controller with the new ID tag so that Verity knows how to find the device and manage it.
That’s it. The most complicated part of this whole process is making sure it’s cabled correctly, but you are a great admin and did that already. 😊
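The order of the whole procedure can be summed up in a few lines. This is just a sketch of the sequence described in this post, not Verity code: the `Step` names, the `OPERATOR_STEPS` split, and the `next_step` helper are illustrative labels of my own.

```python
from __future__ import annotations

from enum import Enum, auto


class Step(Enum):
    MARK_OUT_OF_SERVICE = auto()
    UPDATE_LLDP_SEARCH_STRING = auto()
    MARK_IN_SERVICE = auto()
    INSTALL_SONIC = auto()
    CONFIGURE_UNDERLAY = auto()
    CONFIGURE_ENDPOINTS = auto()


# The three steps the operator performs by hand; the rest happen automatically.
OPERATOR_STEPS = {
    Step.MARK_OUT_OF_SERVICE,
    Step.UPDATE_LLDP_SEARCH_STRING,
    Step.MARK_IN_SERVICE,
}


def next_step(done: list[Step]) -> Step | None:
    """Return the step that follows `done`, or None when everything is finished."""
    order = list(Step)  # Enum members iterate in definition order
    if done != order[: len(done)]:
        raise ValueError("steps were performed out of order")
    return order[len(done)] if len(done) < len(order) else None


for step in Step:
    who = "operator" if step in OPERATOR_STEPS else "automatic"
    print(f"{step.name}: {who}")
```

Everything after `MARK_IN_SERVICE` is the part you get to watch rather than do.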
Luke Williams
Product Lead
With over 30 years of experience in the IT field, there is not a whole lot Lucas hasn’t seen. From running a local ISP in his hometown in Iowa when he was 16, to managing networks and server administration at the second-largest newspaper media company in the United States in 2012, to developing IoT solutions and network operating systems while working at Canonical, he is constantly learning and trying new technologies to keep up with market and company demands. He is currently the Product Lead at BE Networks, specializing in Open Networking and SONiC.