Four tips for choosing the right high availability solution

High Availability and the “LeBron Is the Greatest of All Time” (G.O.A.T.) Debate

I was losing at Spades. I was losing at Kahoot. I was losing at a game of basketball, and all to the same friendly competitor, Brandon. So, to distract him, I went back to my go-to debate: “LeBron is the greatest of all time!” The next tension-filled minutes were packed with back-and-forth rants tinged with the names of some of basketball’s greats: Michael Jordan, Julius Erving, Wilt Chamberlain, Bob Cousy, Shaq, Bill Russell, Jerry West, Steph Curry, Kevin Durant, Kobe Bryant, Magic and Worthy, and LeBron. He jousted with, “How can you even say LeBron is the greatest? Kobe had a killer instinct!” Our verbal sparring expanded to the requirements themselves: what makes someone part of the conversation of greatness, or even a candidate for the discussion? Do they need longevity, scoring records, defensive prowess, other accolades and honors? How many Most Valuable Player awards should they have as a minimum? What about how they transcended their era? What about this or that? And, of course, my friend Brandon is always quick to throw in titles!

How to Choose the Best High Availability Solution

But what does this have to do with High Availability? Glad you asked. How often have you been asked to provide or choose the best high availability, or higher availability, solution from a sea of contenders? You’ve decided that the last weekend ruined by an unplanned application crash or a down production server was the last weekend that will be ruined by a lack of automated monitoring and recovery. But which solution is best among great names like Microsoft Failover Clustering, SUSE High Availability Extension, Pacemaker, NEC ClusterPro, VMware HA, SIOS Protection Suite, and SIOS AppKeeper? Here are four things I learned in sparring over the Greatest Of All Time that will help you with your high and higher availability quandary.

The Requirements for HA

First, what are the requirements? If I wanted the best pure shooter of all time, I’d easily and readily include Steph Curry. If I wanted the most intimidating physical presence, I’m going with someone like Shaq. If I need the best teammate, assist leader, or all-around great, then I think LeBron James, Magic Johnson, Jerry West, and Larry Bird are in the conversation. Likewise, before you start spinning up an HA solution, understand what you need. Is data replication essential or optional? Do you need SQL, or are you equally inclined to use other databases? What other applications and packages are necessary? Do you need a solution that can usher you into the cloud, but that first has to tame legacy, VMware, and physical systems? Will you be an all-Windows application shop, or a mixture? Think about your team as well. Do you have high turnover that makes managing multiple solutions difficult, training courses essential, and real live people in support critical? Do you need ease of use, or is robustness all that matters? Where do the longevity and stability of the offering, product, and company fit?

Second, how are you prioritizing your requirements? How will you prioritize the greats against the established requirements? My friend Brandon is always quick to throw in titles. He always counters, “How many titles does LeBron have?” Titles are king in his debate. I typically, and sarcastically, counter that even the 12th man on the bench gets a ring. I highlight the fact that Robert Horry, an outstanding power forward, has more titles than LeBron and MJ. Have frank and honest conversations about the priority of the requirements. As you pick an HA solution, how important are ease of use, OS support, and breadth of application support compared to RTO/RPO? Which features and requirements are must-haves, which are should-haves, and which are merely nice to have? As VP of Customer Experience, I once encountered a customer who insisted that the cluster software support 32 nodes, despite having no plans to build clusters larger than two or three. Prioritize the list.
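One way to make the must-have, should-have, nice-to-have conversation concrete is a small scoring sheet. The sketch below is only an illustration: the weights, the requirement names, and the candidates "Solution A" through "Solution C" are hypothetical placeholders, not real products or recommended values. It treats any missing must-have as disqualifying and then ranks the remaining candidates by a weighted sum.

```python
from dataclasses import dataclass

# Hypothetical weights; agree on your own values with your team.
WEIGHTS = {"must": 5, "should": 3, "nice": 1}


@dataclass(frozen=True)
class Requirement:
    name: str
    priority: str  # one of "must", "should", "nice"


def score(requirements, supported):
    """Return None if a must-have is missing, else the weighted sum of met requirements."""
    total = 0
    for req in requirements:
        met = req.name in supported
        if req.priority == "must" and not met:
            return None  # a missing must-have disqualifies the candidate
        if met:
            total += WEIGHTS[req.priority]
    return total


def rank(requirements, candidates):
    """Return (name, score) pairs for qualifying candidates, highest score first."""
    scored = {name: score(requirements, feats) for name, feats in candidates.items()}
    qualified = [(name, s) for name, s in scored.items() if s is not None]
    return sorted(qualified, key=lambda pair: pair[1], reverse=True)


# Illustrative requirements; note that 32-node support is only "nice" here.
requirements = [
    Requirement("data replication", "must"),
    Requirement("windows and linux support", "must"),
    Requirement("ease of use", "should"),
    Requirement("32-node clusters", "nice"),
]

# Made-up candidates and feature sets, purely for demonstration.
candidates = {
    "Solution A": {"data replication", "windows and linux support", "ease of use"},
    "Solution B": {"data replication", "ease of use", "32-node clusters"},
    "Solution C": {"data replication", "windows and linux support", "32-node clusters"},
}

ranking = rank(requirements, candidates)
print(ranking)
```

With these made-up inputs, the candidate that lacks a must-have drops out no matter how many extras it offers, which is the same point as the 32-node story: a feature you will never use should not outrank one you cannot live without.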

Measuring RPO and RTO for Disaster Recovery

Third, how are you measuring those requirements? How will you measure the greats against the established requirements? Stats in basketball are fun, informative, and often misleading. Brandon often reminds me to check how scoring titles were won as often as I tout how many were won. We often trade barbs about who is better to start or close the game and how to really measure drive, intensity, and a will to win. Likewise, when you comb through the literature and pore over the proof-of-concept details, determine and define how you will measure things like RPO and RTO. Is RTO based on the client reconnect time or the time the application is restarted? Are you measuring RTO for a failover (server crash), a recovery (application crash), a manual switchover (administrative action), or all of the above? If application performance is important to you, what does that measurement look like? Is it read performance, write performance, or based on the client’s actual or characterized workload? Think about where benchmarks fit in, if they fit at all. Also, be honest about what you are comparing the numbers to. Measuring for faster database query times during normal operation and on recovery is important, but what if the rest of the solution creates lags that surface higher up in the user experience?
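To show how much the definition matters, here is a minimal sketch that computes RTO two ways (until the application restarts, and until the client reconnects) and reports the worst case per scenario. Everything in it is an assumption made for illustration: the `Trial` record, its field names, the timestamps, and the simplification that RPO is the gap between the last replicated write and the start of the outage. Real measurements would come from your own proof-of-concept logs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Trial:
    scenario: str  # e.g. "failover", "recovery", "switchover"
    last_replicated_write: datetime
    outage_start: datetime
    app_restarted: datetime
    client_reconnected: datetime

    def rto_app(self):
        """Seconds from outage start until the application is running again."""
        return (self.app_restarted - self.outage_start).total_seconds()

    def rto_client(self):
        """Seconds from outage start until the client has reconnected."""
        return (self.client_reconnected - self.outage_start).total_seconds()

    def rpo(self):
        """Seconds of writes at risk: outage start minus last replicated write."""
        return (self.outage_start - self.last_replicated_write).total_seconds()


def worst_case(trials):
    """Per scenario, return the largest observed (rto_app, rto_client, rpo) in seconds."""
    result = {}
    for t in trials:
        current = (t.rto_app(), t.rto_client(), t.rpo())
        previous = result.get(t.scenario, (0.0, 0.0, 0.0))
        result[t.scenario] = tuple(max(a, b) for a, b in zip(previous, current))
    return result


def at(seconds):
    """Helper: a timestamp offset from an arbitrary, made-up reference time."""
    return datetime(2024, 1, 1, 12, 0, 0) + timedelta(seconds=seconds)


# Made-up trial data, purely for demonstration.
trials = [
    Trial("failover", at(-2), at(0), at(45), at(60)),
    Trial("failover", at(-1), at(0), at(50), at(58)),
    Trial("switchover", at(0), at(0), at(20), at(25)),
]

summary = worst_case(trials)
for scenario, (rto_app, rto_client, rpo) in summary.items():
    print(f"{scenario}: RTO(app)={rto_app}s RTO(client)={rto_client}s RPO={rpo}s")
```

In this invented data the same failover scenario yields a 50-second RTO by one definition and a 60-second RTO by the other, so two reports can disagree without either being wrong. Writing the definition down before the proof of concept keeps the comparison honest.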

Evaluating High Availability and Disaster Recovery

Lastly, keep evaluating. From the time that Julius rocked the baby to sleep on the baseline, to the days when Jordan took off from the free-throw line, to the time when Steph Curry shot from a step inside the half-court line, the game of basketball has been evolving. The “Jordan Rules” and “Bad Boy Era” swagger have been replaced with a ruleset that favors and highlights the combination of skill, power, and finesse. Likewise, the landscape of technology is constantly changing. The solution that made the top ten when Solaris and MP-RAS servers ruled the day may not have adapted to the nimbleness of Linux, Windows, or other variants. The SAN-based solution that harnessed the power of Fibre Channel may be obsolete in the cloud and SANless world. So, keep evaluating greatness. Keep monitoring how the solutions in the top ten are moving with the trends or, better yet, still making them.

My debate with Brandon rages on, and generations from now even our children will likely not have settled on a winner. You, however, can select the right HA solution to meet your enterprise availability needs. Contact a SIOS representative to help you understand, prioritize, and measure the SIOS Protection Suite’s ability to exceed your requirements.

