Seven Skills That Your Team Needs if You are Going with Open Source High Availability


In the realm of High Availability (HA), there are certain important skills your team needs if you decide to go the open source route. Open source denotes software whose source code is freely available to use, modify, and redistribute.
Today, there are numerous commercial implementations of high availability clusters for many operating systems, provided by vendors like Microsoft and SIOS Technology Corp.  These commercial solutions provide resource monitoring, dependency management, failover and cluster policies, and some form of management, all prepackaged and priced.  As an alternative to commercial implementations, several open source options also give companies the opportunity to provide high availability for their enterprise.
As companies continue to look for optimizations, cost savings, and potentially tighter control, a growing number of them are considering a move to open source availability solutions.
Here are seven skills that your team may need for a move to Open Source HA:

1. Coding skills

In many cases the lack of pre-packaged and bundled support for enterprise applications means that your team will need to be able to develop solutions to protect components, fix issues with bundled components, or write application connectors to ensure application awareness is properly handled.  Lots of people can write scripts, but your team will need to know how to create and adhere to sound development practices and standards.  The basics of this include things such as:

  • Design and Architecture Requirements
  • Design Reviews
  • Code / Code Reviews and Unit Tests (preferably automated)
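To make the idea of an application connector with automated unit tests concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `AppConnector` class, the `FakeService` test double, and the `exampledbctl` command are hypothetical and do not come from any particular HA framework.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A runner executes a command and returns its exit code (hypothetical contract).
Runner = Callable[[Sequence[str]], int]


@dataclass
class AppConnector:
    """Hypothetical connector wrapping an application's control commands."""
    name: str
    start_cmd: Sequence[str]
    stop_cmd: Sequence[str]
    status_cmd: Sequence[str]
    run: Runner

    def is_running(self) -> bool:
        # Assumption: the status command exits 0 only when the app is up.
        return self.run(self.status_cmd) == 0

    def start(self) -> bool:
        # Idempotent: starting an already running app counts as success.
        if self.is_running():
            return True
        return self.run(self.start_cmd) == 0 and self.is_running()

    def stop(self) -> bool:
        # Idempotent: stopping an already stopped app counts as success.
        if not self.is_running():
            return True
        return self.run(self.stop_cmd) == 0 and not self.is_running()


class FakeService:
    """Test double that stands in for a real application in unit tests."""

    def __init__(self) -> None:
        self.running = False

    def __call__(self, cmd: Sequence[str]) -> int:
        action = cmd[-1]
        if action == "status":
            return 0 if self.running else 3
        if action in ("start", "stop"):
            self.running = (action == "start")
            return 0
        return 2  # unknown action
```

Because the command runner is injected, the connector logic can be unit tested without touching a real service. In production, the runner might be something like `lambda cmd: subprocess.run(cmd).returncode`, though the right choice depends on how your application is controlled.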

2. Knowledge of the technology environment

Many enterprise applications require integration with multiple systems in order to provide high availability that meets the Service Level Agreements (SLAs) and Service Level Objectives (SLOs).  Your team will require deep application awareness and knowledge of the technology environment to build protection and solutions for this integration with multiple enterprise systems.  You need people who know the ins and outs of the critical applications, the technology environment for those applications, networking, hardware, and hypervisors, and who understand the environmental and application dependencies.  You’ll also need team members who understand the architecture, features, and limitations of the set of HA technologies that you intend to use from the Open Source community. Consider how many of these areas your team knows and understands:

  • Data passing and node communication
  • Node failure
  • Application management
  • System recovery and restart
  • Logging and messages
  • Data resilience and protection
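As one illustration of the node communication and node failure items above, here is a minimal sketch of timeout-based heartbeat tracking. The `HeartbeatMonitor` class and the node names are hypothetical, and the sketch is far from a complete failure detector: a real cluster also has to deal with problems such as network partitions and split-brain, which this does not attempt.

```python
import time
from typing import Dict, List, Optional


class HeartbeatMonitor:
    """Hypothetical tracker: a node is suspected failed once its last
    heartbeat is older than timeout_s seconds."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._last_seen: Dict[str, float] = {}

    def record(self, node: str, now: Optional[float] = None) -> None:
        # Store the time of the most recent heartbeat from this node.
        self._last_seen[node] = time.monotonic() if now is None else now

    def failed_nodes(self, now: Optional[float] = None) -> List[str]:
        # Return nodes whose last heartbeat is older than the timeout.
        current = time.monotonic() if now is None else now
        return sorted(
            node
            for node, seen in self._last_seen.items()
            if current - seen > self.timeout_s
        )
```

The optional `now` argument exists only so the logic can be tested with fixed timestamps instead of real waiting. Choosing the timeout is a trade-off your team has to understand: too short and a slow network triggers needless failovers, too long and a real outage goes undetected.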

3. Business process knowledge

You need someone who understands your business requirements and business processes.  Your team needs professionals who understand the enterprise’s business and the processes that drive it.  Your team will need to know how much budget is available for developing the solution, how much risk the business is willing to take, and how to gather additional requirements that may be unspoken or unspecified.
The team will also need to know, or hire someone who knows, how to convert those business requirements into software requirements and how to manage a process that brings a minimum viable high availability solution to fruition, one that meets the needs of the business, keeps up with the speed of the business, and fits within its processes.

4. Experience with OS, Applications and Infrastructure

If you are looking to go all open, your team will need experience with operating systems, applications, and infrastructure.  You’ll need to understand the various OS release cycles, including kernel versions for Linux and updates and hotfixes for Windows.  You have applications in house that need to be supported, but you’ll also need to be diligent about understanding the application update cycles, their dependencies, and the intersection of application and OS support matrices.  If your environment is homogeneous, great.  Otherwise, your team will need to know the differences between RHEL, RHEL derivatives, and SUSE.  If you run both Linux and Windows, you’ll need to know the differences between those as well.  You’ll also need to understand the difference that the infrastructure makes to the application and OS combination.  AWS and Azure present high availability considerations that differ from those of GCP, on-premises deployments, and other hypervisors.
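One way to keep track of the intersection of application and OS support matrices is to record it as data and check your nodes against it. The sketch below assumes a made-up application called `exampledb` and made-up matrix entries. None of it reflects a real vendor's support statement, and the `unsupported_nodes` helper is hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical support matrix: (application, version) -> supported OS releases.
# These entries are invented for illustration only.
SUPPORT_MATRIX: Dict[Tuple[str, str], Set[str]] = {
    ("exampledb", "12"): {"rhel-8", "rhel-9", "sles-15"},
    ("exampledb", "11"): {"rhel-7", "rhel-8"},
}


def unsupported_nodes(
    app: str, version: str, node_os: Dict[str, str]
) -> Dict[str, str]:
    """Return the nodes whose OS release is not listed for this app version."""
    supported = SUPPORT_MATRIX.get((app, version), set())
    return {
        node: os_release
        for node, os_release in node_os.items()
        if os_release not in supported
    }
```

Running a check like this before an OS or application upgrade flags combinations that fall outside the matrix, for example a node still on an older release than the new application version supports.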

5. Change management capabilities

Imagine that you have the development team to create the solution, with technical and business knowledge along with a firm grasp of the OS, infrastructure, and applications.  But getting the scripts together is just the beginning.  Your team will also need change management capabilities.  How will your team keep track of code changes and versions, packages, and package locations?  How will your team manage the releases of updates and changes?  Your team will need to be well versed in a source control system such as Git, project management tools such as Jira, and release trains.  You’ll need a team that understands how to make updates to code and deliver patches and fixes, all while avoiding unwanted impact.

6. Data analytics and troubleshooting experience

When you enter the space of delivering your own HA solution, your team will need analytics and troubleshooting experience.  You’ll need people who understand the intersection of application code, system messages, and application error logs and trace files.  When a system crash occurs, you’ll have to dig deeper into the logs to troubleshoot and find the root cause, analyze the data to make recommendations, and be prepared to roll out changes (see #5 above).  Don’t forget, your team will also need to understand what the data from these logs and trace files can tell you about the health of your environment even when there isn’t an error, failure, or system crash.
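As a small example of reading health signals from logs even when nothing has failed, the sketch below counts log entries per hour and severity, so a rising number of warnings stands out before it becomes an outage. The line format (`<ISO timestamp> <LEVEL> <message>`) and the `severity_by_hour` helper are assumptions for illustration. Real cluster and application logs will use their own formats.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed line format: "<ISO timestamp> <LEVEL> <message>".
# Group 1 captures the date and hour, group 2 the severity level.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}):\d{2}:\d{2}\S*\s+(DEBUG|INFO|WARN|ERROR)\s"
)


def severity_by_hour(lines: Iterable[str]) -> Counter:
    """Count log entries per (hour, severity); unparseable lines are skipped."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Comparing the warning counts from hour to hour, or against a known-good baseline, gives your team a simple trend to watch in between incidents.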

7. Connections (Dev, QA, Partners, Community)

Let’s be honest, your business isn’t about delivering high availability, but if you decide to dive into the realm of open source HA you are going to need more help than just the brilliance on your team.  Key to getting that additional help will be understanding where to start and then making the right connections to community developers, testing experts, HA and application partners, and the open source community.  Open forums can be really helpful, but you’ll need to check whether their response times are compatible with your SLAs and SLOs.
Using open source solutions is an option that many companies choose to pursue because of a perception of flexibility, lower cost, and less risk.  But, buyer beware, there may be hidden costs in the form of new skills and management, and hidden risks in the open source programs that any “roll your own HA solution” will depend on.
– Cassius Rhue, VP, Customer Experience

