Application Intelligence in the context of High Availability (HA) refers to the system’s ability to understand and respond intelligently to the behavior and health of applications in real time to maintain continuous service availability.
What is Application Intelligence?
Application intelligence involves monitoring, analyzing, and reacting to several factors: application state (is the application up or down?), performance metrics (response time, error rates, throughput, and memory usage), application dependencies (such as databases or external services), and user behavior or usage patterns. Together these give a more holistic view of the application, using multiple data points to make informed decisions about the state of the application itself, not just the infrastructure. Take a web server as an example: it’s not enough to know that the server is running. Is the site accessible without errors? Is the response slow? Are users refreshing repeatedly to get through? Is the database the website relies on up, running, and reachable? These are the kinds of questions application intelligence has to answer.
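To make the web server example concrete, here is a minimal sketch of how several signals might be combined into a single verdict. The `AppSignals` type, the `assess` function, and the thresholds are all hypothetical and chosen for illustration; they are not taken from any product.

```python
from dataclasses import dataclass


@dataclass
class AppSignals:
    """Hypothetical snapshot of what a monitor has observed."""
    process_up: bool          # is the application process running?
    response_time_ms: float   # latest measured response time
    error_rate: float         # fraction of failed requests, 0.0 to 1.0
    dependencies_ok: bool     # are databases / external services reachable?


def assess(signals, max_response_ms=500.0, max_error_rate=0.05):
    """Return 'down', 'degraded', or 'healthy' from several signals.

    The thresholds are illustrative defaults, not recommendations.
    """
    # A stopped process or an unreachable dependency means no service at all.
    if not signals.process_up or not signals.dependencies_ok:
        return "down"
    # The process is up, but users are seeing slowness or errors.
    if (signals.response_time_ms > max_response_ms
            or signals.error_rate > max_error_rate):
        return "degraded"
    return "healthy"
```

The point of the sketch is that a running process alone never yields "healthy"; the verdict also depends on what users and dependencies are experiencing.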
How LifeKeeper Uses Application Intelligence
So, how does LifeKeeper use application intelligence to enhance high availability for critical applications? Let’s break it down. LifeKeeper uses application-specific recovery kits (ARKs) that contain knowledge of each application (SAP, SQL, PostgreSQL, Oracle, etc.). This allows LifeKeeper to handle each application’s startup and shutdown procedures, monitor the health and status of both the application and its dependencies, and orchestrate intelligent failover and failback operations without corrupting data. Users can group related resources into a hierarchy within LifeKeeper, which lets LifeKeeper understand the dependencies between application components (for example, when a service relies on an IP address or a database). This ensures that failovers happen in the correct order and that recovery actions don’t leave the application in an inconsistent or broken state.
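The idea of a resource hierarchy can be sketched with a plain dependency graph. The example below is not LifeKeeper’s implementation; it only illustrates how a start order and a stop order could be derived from declared dependencies, using Python’s standard `graphlib` module (Python 3.9+). The resource names and the `start_order` and `stop_order` helpers are hypothetical.

```python
from graphlib import TopologicalSorter


def start_order(depends_on):
    """Return resources ordered so every dependency comes before its dependent.

    `depends_on` maps a resource name to the set of resources it needs.
    """
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on):
    """Stopping is the reverse: dependents go down before what they rely on."""
    return list(reversed(start_order(depends_on)))


# Hypothetical hierarchy: the app needs an IP and a database,
# and the database needs its filesystem.
hierarchy = {
    "app": {"ip", "database"},
    "database": {"filesystem"},
}

if __name__ == "__main__":
    print("start:", start_order(hierarchy))
    print("stop: ", stop_order(hierarchy))
```

Resources with no dependency between them (here, `ip` and `filesystem`) may appear in either relative order; only the declared dependencies are guaranteed to be respected.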
Additionally, LifeKeeper performs deep health checks. It doesn’t just determine whether the server is up; it also runs more detailed checks, such as whether a database is accepting connections or whether a web service is returning the expected responses. It can even monitor whether certain expected background processes are running. LifeKeeper also uses application-specific configuration files to keep configuration consistent across nodes and to ensure that application settings are preserved or restored correctly. Lastly, LifeKeeper can use custom scripts to fine-tune these deep checks, so that less common or homegrown applications can be supported intelligently as well.
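As an illustration of what a “deeper than up/down” check can look like, here is a small sketch that tests whether a service is actually accepting TCP connections. It is a generic example, not a LifeKeeper script; the function name `accepts_connections` and the timeout value are assumptions made for this sketch.

```python
import socket


def accepts_connections(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    This goes one step beyond "the server is up": the process may be
    running and still refuse or never answer new connections.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A custom check script could call something like `accepts_connections("127.0.0.1", 5432)` and turn the result into an exit status. A successful connection still does not prove the application answers requests correctly, so a real check would usually follow up with an application-level query.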
PostgreSQL ARK: A Real-World Example of Application Intelligence
To take a deeper dive, we can look at how the PostgreSQL ARK uses Application Intelligence. The PostgreSQL ARK uses application-specific logic to monitor, start, stop, and fail over PostgreSQL. That logic relies on knowledge of the PostgreSQL startup and shutdown commands, awareness of critical configuration files such as postgresql.conf and pg_hba.conf, and an understanding of the data directory layout and lock file behavior.
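A small sketch shows what “awareness of the data directory layout” might mean in practice: checking that the entries a PostgreSQL data directory normally contains are present before attempting a start. This is not the ARK’s code. The `missing_entries` helper is hypothetical, and the list of expected files is an assumption, since some distributions keep postgresql.conf and pg_hba.conf outside the data directory.

```python
from pathlib import Path

# Assumed layout: adjust if the configuration files live elsewhere.
EXPECTED_ENTRIES = ("PG_VERSION", "postgresql.conf", "pg_hba.conf")


def missing_entries(data_dir, expected=EXPECTED_ENTRIES):
    """Return the expected entries that are absent from `data_dir`.

    An empty list means the directory at least looks like a usable
    PostgreSQL data directory; it says nothing about data integrity.
    """
    root = Path(data_dir)
    return [name for name in expected if not (root / name).exists()]
```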
Intelligent Monitoring and Ordered Failover for PostgreSQL
Additionally, the ARK doesn’t just check that PostgreSQL is running. It also checks whether the database is responding to queries, whether the correct data directory is accessible, and whether there is any corruption in the transaction logs. It uses dependency tracking to make sure that the resources PostgreSQL typically depends on are available, such as the virtual IP for client connections and the mounted storage for its data directory. This lets LifeKeeper bring up resources in the correct order during a failover: mount the disk first, bring up the IP, then start PostgreSQL, and finally verify the health of the service.
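The ordered bring-up described above can be sketched as a list of steps that run in sequence and stop at the first failure, so that PostgreSQL is never started on top of a missing disk or IP. The `bring_up` function and the step names are hypothetical, and the actions are placeholders standing in for real mount, IP, start, and verification logic.

```python
def bring_up(steps):
    """Run (name, action) steps in order and stop at the first failure.

    Each action is a callable returning True on success.
    Returns (ok, completed), where `completed` lists the steps that succeeded.
    """
    completed = []
    for name, action in steps:
        if not action():
            # Later steps depend on this one, so do not continue.
            return False, completed
        completed.append(name)
    return True, completed


# Placeholder actions; a real system would mount storage, configure
# the virtual IP, start PostgreSQL, and run a test query here.
failover_steps = [
    ("mount data disk", lambda: True),
    ("bring up virtual IP", lambda: True),
    ("start PostgreSQL", lambda: True),
    ("verify service health", lambda: True),
]

if __name__ == "__main__":
    ok, done = bring_up(failover_steps)
    print("success" if ok else "failed", "after:", done)
```

Returning the list of completed steps makes it straightforward for a caller to undo them in reverse order if a later step fails.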
Preventing Split-Brain and Ensuring Data Integrity
Lastly, LifeKeeper uses application intelligence to avoid split-brain scenarios, in which more than one node believes it is the primary. It does this by never starting two active PostgreSQL servers against the same data directory, and it avoids data corruption by not failing over while writes are still in progress. These are just some of the ways LifeKeeper and its ARKs apply application intelligence to make the combined product as resilient as possible.
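One local safeguard against starting a second server on the same data directory can be sketched with PostgreSQL’s lock file, postmaster.pid, whose first line records the server’s process ID. The sketch below is an illustration only, not LifeKeeper’s mechanism: `safe_to_start` is a hypothetical helper, the liveness probe is POSIX-only, and it only sees processes on the local node. Preventing split-brain across nodes additionally requires cluster-level coordination, which is outside this example.

```python
import os
from pathlib import Path


def process_alive(pid):
    """POSIX-only probe: signal 0 checks for existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def safe_to_start(data_dir):
    """Return True only if no live local server appears to own `data_dir`."""
    lock_file = Path(data_dir) / "postmaster.pid"
    if not lock_file.exists():
        return True
    try:
        # First line of postmaster.pid is the PID; abs() is defensive.
        pid = abs(int(lock_file.read_text().split()[0]))
    except (IndexError, ValueError):
        return False  # unreadable lock file: stay conservative
    if pid == 0:
        return False
    # A lock file naming a dead process is stale and does not block a start.
    return not process_alive(pid)
```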
Strengthen Application Resilience with Intelligent High Availability
In summary, LifeKeeper’s built-in application intelligence enables precise, fast, and reliable failover and recovery by understanding how applications behave and what they need to run correctly.
Ensure application resilience and uninterrupted service. Request a demo or start your free trial today to experience how SIOS LifeKeeper uses application intelligence to protect your critical workloads.
Author: Cassy Hendricks-Sinke, Principal Software Engineer, Team Lead