There are many places to start when building the plan, strategy, design, and architecture for a highly available cluster. Wise builders want to understand the basic requirements first: two nodes or three, a Recovery Time Objective (RTO) under ten minutes or under five, a Recovery Point Objective (RPO) of near zero or absolutely zero. Architects also want to understand how the hardware and network can be made resilient. Will you deploy in the datacenter, in the cloud, or in a mixture of both? Beyond the architecture of the underlying hardware, requirements gathering and design also lead to an understanding of the critical applications, the High Availability (HA) software, the processes and governance procedures that will need to be followed, and the additional dashboards and integrations required for reporting, monitoring, and alert distribution. All team members will also want to understand the basics of recovery and failover orchestration.
Why Company and Solution Provider History Matters in High Availability
But one thing that often gets overlooked in the deployment of High Availability is company history. Of course, if you are going to entrust your enterprise environment to a monitoring, alerting, recovery, and failover orchestration solution, you’d want to know and understand who the provider is, what it does, and how long it has been doing it well. Is this a new startup company located in Buford, Wyoming, a company available in the US only, or a global company that happens to have an HA offering in mothballs that is only trotted out to close other parts of a deal?
As you build your architecture, of course, you need to know that the HA company knows, understands, and does HA well. But, and this is a big one, the most important history your team needs to know when architecting your HA solution isn’t theirs but yours.
As VP of Customer Experience, I’ve worked with numerous customers, teams, architects, and solution integration teams on deploying HA solutions both on-premises and off. In many of these discussions, one overlooked factor in deploying a sound infrastructure and HA architecture is the history of the company itself. So, why does the history of your company, or of the company you are architecting HA for, matter?
Five Ways Company History Shapes HA Architecture
Here are five ways in which company history should impact your HA architecture:
1. Company Size (Too big or too small)
What is your company’s history with regard to the HA team? Does your company have too many people on the team, with conflicting or overlapping roles and responsibilities? Or does your company have a team that is undersized, even as it overachieves? Depending on the history of your company and its size over that time, you may need to adjust your design for additional authentication, more granular permissions and restrictions, and so on. If your team is small, developing and maintaining a free solution may be too much of a burden. If your team is large, with many roles and overlaps and the time to develop custom solutions, consider whether a commercial solution would be a better fit, freeing those resources up for new development, additional improvements, or even greater efficiency in day-to-day operations.
2. Company Life Cycle (Every five years or not until it breaks)
What is your company’s life cycle history? Does your CIO/CTO revamp your entire infrastructure on a fixed cycle, or are they more of an “If it ain’t broke, don’t fix it,” type? If your company has a long history of trading out and replacing solutions and providers, then your architecture will need to be more robust to handle the swapping in and out of components and pieces. In this case, your HA architecture will also need to factor in offboarding, end of life, and onboarding of a potentially new solution within a short period of time. A key for this type of high turnover will be to limit custom work and hard dependencies.
On the other hand, if your HA solution will be in place for ten years or more, you’ll want to make sure that your vendors provide maintenance and extended support for the critical components within your infrastructure. Your architecture will also need to heavily weigh the challenges that might be encountered with various software solutions and interoperability as the solution ages past the standard support lifecycle, and how to mitigate those risks.
3. Company Staffing (The revolving door or the lone ranger)
One of my most shocking memories as VP of Customer Experience is of working with a company to architect a solution for HA. Within one week of the go-live date, the project manager for that team announced that he and his whole team had been terminated. The go-live would be transferred to a new team, both new to the company and new to HA. As I would later learn, company Z had a revolving door policy with IT and the administrators for their HA environment. Most, if not all, of their resources were contractors. If your company has a history of high turnover, then your architecture and design must include a runbook, and the processes and procedures for maintenance must also include training: formal product training, procedural testing, administration training, and chaos scenarios.
The revolving door isn’t the only company staffing history to be aware of. The Lone Ranger is another scenario that is critical to know and understand. At SIOS, our team joined a bewildered project manager looking for any answers and information regarding their enterprise systems, both involving SIOS and beyond. The Lone Ranger had left the company for unspecified reasons, and upon their departure, new members of the team discovered that a lot of tacit knowledge had never been written down in any documents they could find. When designing and building your architecture, knowing the type and history of staffing can help you design solutions properly, and may lead your team to choose a solution that is commercially available and backed by services for the unfortunate Lone Ranger departures.
4. Company Past Disasters
Company disasters and downtime are another historical point that needs to be understood well by designers of HA solutions. Typically, company disasters make their way into future architecture designs as requirements. The past disasters, including their root causes, risk mitigation strategies, and detection, prevention, and reporting recommendations, are often added to the deck of initial requirements. However, digging into the history of the disasters may uncover more requirements and factors that need to be accounted for. In my time as VP of Customer Experience, our team has learned a tremendous amount about building a better experience for several of our clients by understanding the company’s disasters. In one instance, unattended VM maintenance was a big part of the company’s strategy, but also a source of many of the company’s availability issues. While working with architects, our services team not only addressed application availability but also helped the design team account for backup and recovery, maintenance and upgrades, and rollback strategies that maintain availability when an automated process fails.
5. Company Culture
As VP of Customer Experience, I lead a team that works closely with customers and partners who are passionate about application availability and adhere to the most stringent of Service Level Agreements (SLAs) and Service Level Objectives (SLOs). As we worked with these teams, their designs and architecture specifications reflected a company culture that considered availability (architecture, design, hardware, networking, applications, cluster software, people, and process) an indispensable part of their business. Sadly, not all companies have this type of culture. Knowing the history of your company’s culture will shape the way you implement HA, bringing out the best in design and architecture, either in adherence to the culture or as a method to improve culture and business success.
Don’t Overlook the Role of Company History in HA Decisions
Yes, the company history of the datacenter or cloud provider is important. Knowing the history of Lou’s Low Cost Cloud, LLC (no offense to Lou), which has been hemorrhaging equipment while running from the mostly un-air-conditioned garage of Lou’s parents’ home, is important if you are considering Lou for your datacenter. Yes, the company history of the application and the HA vendor is also important. Knowing the history of your ERP, database, and frontend application provider is key to assessing and mitigating risks, understanding deployment patterns and methodology, and gaining confidence that timely fixes, updates, security, and support will be a cornerstone of your architecture. But do not underestimate the importance of knowing your own company’s history and how its critical failures should shape your new and ongoing HA decisions and infrastructure.
Author: Cassius Rhue, VP, Customer Experience