Introducing the Generic Load-Balancer Kit for SIOS LifeKeeper and Microsoft Azure

SIOS Background

In this blog, I will discuss the Generic Load-Balancer Application Recovery Kit (ARK) for SIOS LifeKeeper for Linux, and specifically how to configure it on Microsoft Azure. I will use a two-node NFS cluster, and the NFS exports it provides will ultimately be accessed via the load balancer.

SIOS has created this ARK to facilitate client redirection in LifeKeeper clusters running in Azure.

Since Azure does not support gratuitous ARP, clients cannot connect directly to traditional cluster virtual IP addresses. Instead, clients must connect to a load balancer, and the load balancer redirects traffic to the active cluster node. Azure implements a load-balancer solution that operates on layer 4 (TCP, UDP). The load balancer can be configured with one or more private or public frontend IPs, a health probe that determines which node is active, a set of backend IP addresses (one for each node in the cluster), and incoming/outgoing network traffic rules.

Traditionally, the health probe would monitor the active port of an application and determine which node that application is active on. The SIOS generic load-balancer ARK is instead configured to have the active node listen on a user-defined port. This port is then configured in the Azure load balancer as the health probe port. This allows the active cluster node to respond to TCP health check probes, enabling automatic client redirection.
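To make the probe behaviour concrete, here is a minimal sketch of the same kind of TCP check run by hand. It is not part of the ARK; the function names `probe_tcp` and `report_nodes` are my own, and it assumes bash (for its `/dev/tcp` feature) and the coreutils `timeout` command are available. A node that accepts the connection is the one the load balancer would treat as active.

```shell
# Sketch only: mimic the load balancer's TCP health probe by hand.
# Assumes bash and coreutils `timeout` are installed on the machine.

# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
probe_tcp() {
    host="$1"
    port="$2"
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Print "<node> active" or "<node> standby" for each node given.
report_nodes() {
    port="$1"
    shift
    for node in "$@"; do
        if probe_tcp "$node" "$port"; then
            echo "$node active"
        else
            echo "$node standby"
        fi
    done
}

# Example with hypothetical node addresses and an example probe port:
# report_nodes 54321 10.0.0.4 10.0.0.5
```

In a healthy cluster, exactly one node should answer on the probe port at any time.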

Installation and configuration in Azure is straightforward and detailed below:

Within the Azure portal, select load-balancing.

Create a load balancer. You will select the resource group where you want it deployed, as well as the name. I like to use a name that lines up with the cluster type that I’m using the load balancer with; for example, IMA-NFS-LB will sit in front of both IMA-NFS nodes.

You can determine whether this will be a public or private LB. In this case I’m configuring a private Load-Balancer to front my NFS server for use only within this resource group.

Once you have chosen the name, resource group and so on, you will be asked to assign a name, virtual network, subnet and IP address for the load balancer’s frontend IP configuration. The IP address should be the same IP address that you will create in LifeKeeper as a virtual IP address.

Once the basic information is entered for the load balancer, you will need to define which machines make up the backend pool that serves it. In my case, this backend pool consists of the two nodes that I’m using for my NFS server.

You will require a load-balancing rule; this is how the load balancer determines what traffic to route to the active node. The port number configured here will be used in SIOS Protection Suite for Linux (SPS-L) when you configure the generic application to support the load balancer. In this example we are using “HA Ports”, which routes all traffic to the active node. If you want to limit which traffic is routed, you can specify a particular application port instead.

The frontend IP should be the load-balancer IP, and the backend pool should be the nodes that you configured as the resources used by the load balancer. Ensure that “HA Ports” is selected and “Floating IP” is enabled. “TCP reset” can be left disabled.

When you create the health probe, make sure you note the port that you configure here as it will be used when we create the generic application within the SIOS Protection Suite. You can use the standard values for “interval” and “Unhealthy threshold”. These can be changed at a later time if you have application specific requirements.

Now the load-balancing rule should be complete with a health probe. Select “Add”.

Once we select “Add”, Azure will start deploying the load balancer. This can take several minutes; once it completes, the configuration moves on to the SIOS Protection Suite.
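If you prefer scripting to clicking through the portal, the same steps can be approximated with the Azure CLI. The sketch below is my own and is not taken from the SIOS documentation: the resource group, network names, frontend address, and the names `frontend`, `backend`, `lk-probe` and `ha-ports` are all placeholders, and the flag names reflect my understanding of `az network lb` and should be checked against `az network lb --help` for your CLI version. By default it only prints the commands; set DRY_RUN=0 to run them.

```shell
# Sketch only: an Azure CLI approximation of the portal steps above.
# Every name and address here is a placeholder; adjust before use.
RG="my-resource-group"     # hypothetical resource group
LB="IMA-NFS-LB"            # load balancer name used in this post
VNET="my-vnet"             # hypothetical virtual network
SUBNET="my-subnet"         # hypothetical subnet
FRONTEND_IP="10.0.0.10"    # hypothetical; must match the LifeKeeper virtual IP
PROBE_PORT=54321           # example health probe port

# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

create_lb() {
    # Internal Standard load balancer with a static private frontend IP.
    run az network lb create --resource-group "$RG" --name "$LB" --sku Standard \
        --vnet-name "$VNET" --subnet "$SUBNET" \
        --frontend-ip-name frontend --private-ip-address "$FRONTEND_IP" \
        --backend-pool-name backend
    # TCP health probe on the port the active node will listen on.
    run az network lb probe create --resource-group "$RG" --lb-name "$LB" \
        --name lk-probe --protocol tcp --port "$PROBE_PORT"
    # HA Ports rule (protocol All, port 0) with Floating IP enabled.
    run az network lb rule create --resource-group "$RG" --lb-name "$LB" \
        --name ha-ports --protocol All --frontend-port 0 --backend-port 0 \
        --frontend-ip-name frontend --backend-pool-name backend \
        --probe-name lk-probe --floating-ip true
}

create_lb
```

Adding the two cluster nodes to the backend pool is a separate step that is not shown here.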

NOTE: Once the backend machines are configured behind a load balancer, they will lose access to the internet gateway, so things like system updates will not work. You can remove the machines from the backend pool to restore internet access.

Configuration with SIOS Protection Suite

For this blog I have configured three NFS exports to be protected using SPS-L. The three exports are configured to use the same IP as the Azure load balancer’s frontend IP. I’m using DataKeeper to replicate the data stored on the exports.

The first step is to obtain the scripts. The simplest way is to use wget, but you can also download the entire package and upload the RPM directly to the nodes using WinSCP or a similar tool. You need to install the hotfix on all nodes in the LifeKeeper cluster.

The entire recovery kit can be obtained here: http://ftp.us.sios.com/pickup/LifeKeeper_Linux_Core_en_9.5.1/patches/Gen-LB-PL-7172-9.5.1

The individual files can be downloaded with wget:

wget http://ftp.us.sios.com/pickup/Gen-LB-PL-7172-9.5.1/steeleye-lkHOTFIX-Gen-LB-PL-7172-9.5.1-7154.x86_64.rpm

wget http://ftp.us.sios.com/pickup/Gen-LB-PL-7172-9.5.1/steeleye-lkHOTFIX-Gen-LB-PL-7172-9.5.1-7154.x86_64.rpm.md5sum

wget http://ftp.us.sios.com/pickup/Gen-LB-PL-7172-9.5.1/Gen-LB-readme.txt

Once downloaded, verify the MD5 sum against the value recorded on the FTP site.
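One way to do that comparison without eyeballing 32 hex characters is a small helper like the one below. It is a sketch of my own: `verify_md5` is a hypothetical function name, and the expected digest is whatever you copy from the download site. If the downloaded .md5sum file happens to be in the standard md5sum format, `md5sum -c` on that file may work as well, but I have not assumed that here.

```shell
# Sketch only: compare a file's MD5 digest with an expected value.
# Returns 0 on a match and non-zero otherwise; letter case is ignored.
verify_md5() {
    file="$1"
    # Normalise the expected digest to lowercase, as md5sum prints lowercase.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}

# Example, with the digest pasted by hand from the download site:
# verify_md5 steeleye-lkHOTFIX-Gen-LB-PL-7172-9.5.1-7154.x86_64.rpm "<digest>" \
#     && echo "checksum OK" || echo "checksum MISMATCH"
```

Run the check on every node where you downloaded the RPM before installing it.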

Install the RPM as follows:

rpm -ivh steeleye-lkHOTFIX-Gen-LB-PL-7172-9.5.1-7154.x86_64.rpm

Check that the install was successful by running:

rpm -qa | grep steeleye-lkHOTFIX-Gen-LB-PL-7172

Should you need to remove the RPM for some reason, this can be done by running:

rpm -e steeleye-lkHOTFIX-Gen-LB-PL-7172-9.5.1-7154.x86_64

Below is the GUI showing my three NFS exports that I’ve already configured:

What we need to do within the SIOS Protection Suite is define the load balancer using the hotfix scripts provided by SIOS.

First we create a new resource hierarchy and select Generic Application from the drop-down.

Define the restore.pl script located in /opt/LifeKeeper/SIOS_Hotfixes/Gen-LB-PL-7172/

Define the remove.pl script located in /opt/LifeKeeper/SIOS_Hotfixes/Gen-LB-PL-7172/

Define the quickCheck script located in /opt/LifeKeeper/SIOS_Hotfixes/Gen-LB-PL-7172/

There is no local recovery script, so make sure you clear this input.

When asked for Application Info, enter the same port number that we configured in the health probe configuration, e.g. 54321.

We will choose to bring the resource into service once it’s created.

Resource Tag is the name that we will see displayed in the SPS-L GUI. I like to use something that makes it easy to identify.

If everything is configured correctly you will see “END successful restore”. We can then extend the resource to the other node so that it can be hosted on either node.

This shows the completed Load Balancer configuration following extension to both nodes.

The last step for this cluster is to create child dependencies for the three NFS exports. This means that all the NFS exports, complete with DataKeeper mirrors and IPs, will rely on the Load Balancer. Should a serious issue occur on the active node, all these resources will fail over to the other functioning node.

Above is the completed hierarchy in the LifeKeeper GUI. Below is the expanded GUI view showing the NFS exports, IP, filesystems and DataKeeper replicated volumes as children of the Load Balancer resource.

This is just one example of how you can use SIOS LifeKeeper in Azure to protect a simple NFS cluster. The same concept applies to any business critical application you need to protect. You simply need to leverage the Load Balancer ARK supplied by SIOS to allow the Azure Load Balancer (Internal or External) to determine which node is currently hosting the application.

