9 Signs You Have an Application Availability Problem

You’ve heard the saying “recognizing a problem is the first step in solving it.” But many small and medium businesses, and surprisingly even large enterprises, aren’t aware that their application availability isn’t what it should be.

Nine signs that you still have an application availability problem:

1. You spend more time restarting an application than using it
Application crashes may be a fact of life, but if your application is down more often than it is up, that is a problem.

2. You’ve started to snooze through the alert storm in your inbox or control center
You have deployed alerts for application or server downtime, but the alert storm has so overwhelmed your inbox that you have silenced them all.

3. You have one data center for all your critical operations
A single data center for operations may sound convenient, but one well-intentioned but misdirected construction crew has been known to turn a single data center into a costly unavailability zone.

4. Your idea of data protection involves backup retrieval and archives
Your data protection strategy is critical. Data replication, whether site-to-site or region-to-region, has become a mainstay, so if your replication or data protection strategy is nonexistent or involves a lengthy jog to the vault, this could be a big problem.

5. Your recovery procedures always require manual intervention
Manual intervention itself is not a problem. Some events are so difficult and complex that some amount of manual effort could be required.  But, if manual intervention is always the first, second and third order of business after a server or application outage, that is a problem.

6. Your RTO is measured in days not hours or minutes
How are you measuring your recovery time objective (RTO)? Do you measure it in days rather than hours or minutes? True, every business has its own tolerance level for RTO. However, your RTO should not be a function of server rebuilds and gross instabilities in your architecture.
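To put numbers on that tolerance, it can help to turn an availability target into a downtime budget. The sketch below is only an illustration: the function name and the 30-day month are assumptions, not part of any product or standard.

```python
# Illustrative sketch: the function name and 30-day month are assumptions.
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_pct: float, days: int = 30) -> float:
    """Return the downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = days * MINUTES_PER_DAY
    # The budget is the fraction of the period that may be spent down.
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        budget = allowed_downtime_minutes(target)
        print(f"{target}% availability -> {budget:.2f} minutes per 30 days")
```

A 99.9% target leaves roughly 43 minutes of downtime in a 30-day month, so a single recovery that takes a day uses up that budget many times over.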

7. You don’t know your RPO because your standby is never reliably in sync
You’ve checked the box on reliable monitoring and recovery of your application, and taken it a step further by providing a cluster-ready standby system. Great job. But before I let you off the hook, what is your recovery point objective (RPO)? An RPO should be something more accurate than “somewhere between day 0 and last night.”
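One rough way to make an RPO concrete is to compare how far the standby lags behind the primary against the target. This is a minimal sketch under stated assumptions: the function names are hypothetical, and it assumes you can obtain the timestamp of the last successful sync from your replication tooling.

```python
from datetime import datetime, timedelta, timezone


def replication_lag(last_sync: datetime, now: datetime) -> timedelta:
    """Time since the standby last synced; this bounds the data you could lose."""
    # Clamp to zero so minor clock skew never yields a negative lag.
    return max(now - last_sync, timedelta(0))


def meets_rpo(last_sync: datetime, now: datetime, rpo: timedelta) -> bool:
    """Return True if the current lag is within the RPO target."""
    return replication_lag(last_sync, now) <= rpo


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    now = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
    last_sync = datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc)
    target = timedelta(minutes=15)
    print("lag:", replication_lag(last_sync, now))
    print("within RPO:", meets_rpo(last_sync, now, target))
```

If a check like this cannot be answered at all because the last sync time is unknown, that is the sign described above.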

8. Single points of failure don’t just exist, they are the norm
Where are your single points of failure? Your budget may not allow you to eliminate every single point of failure, but if you can identify a single point of failure in every major category and every critical component of your enterprise, that is a problem.

9. Your last disaster made local, regional, or national news 
If the last major storm, grid failure, or failure event put a blight on your business due to downtime, then higher availability is the next order of business.

Downtime costs your business in terms of customers, productivity, and peace of mind. Unaddressed risks have a definite impact on your business and reputation. If these warning signs are there, you may have an availability problem. And if you ignore them, you’ll likely have even bigger problems soon thereafter.

— Cassius Rhue, VP, Customer Experience

