Date: December 29, 2021
Tags: Cloud, disaster recovery, High Availability
Reading Time: 4 minutes
In one way or another, the world-changing events of 2020 and 2021 reshaped nearly everything we knew, and high availability was no exception. Despite closures and restrictions, many IT teams traded on-prem data centers for the cloud. Many are now asking, ‘Now what?’ Here are five things to do to fix your cloud journey in 2022.
Add high availability
In the push to the cloud, many IT and business leaders found themselves rushing to move services and applications into the cloud from data centers that they were closing due to COVID-19. Others rushed to the cloud not because of data center closures, but to deal with a wave of exploding demand. For some, the journey was so fast that HA wasn’t included, and now they’ve discovered the hard way that applications still crash in the cloud, and that unexpected outages and unplanned downtime are as much a nemesis in AWS, Azure and GCP as they were in the previous data center.
The first step in fixing your cloud journey is to add high availability. This will mean several things to your enterprise:
- Designing and architecting a highly available and redundant architecture
- Choosing software and services that will protect critical components and applications
- Defining and documenting associated processes and procedures, and at least minimal governance
- Deploying production copies for quality assurance, procedural testing, and chaos testing
Expand to higher availability and disaster recovery
Of course, not everyone made the move to cloud without considering some form of HA. Some IT teams had the foresight not to leave HA behind on-premises, but in the rush to cloud moved all of their critical servers into the same cloud Availability Zone. While having some HA protection is better than complete vulnerability, if you’ve only deployed your servers and applications in a single Availability Zone (AZ), now is the time to expand to multi-AZ for your standby cluster node, or even to build in disaster recovery by deploying a third node in a different region. SIOS has helped dozens of customers plan multiple-AZ architectures and add disaster recovery solutions.
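To see why spreading nodes across zones matters, here is a minimal sketch of the standard redundancy arithmetic. The function name and the 99% per-node figure are hypothetical, chosen only for illustration. The calculation assumes node failures are independent and that failover is instantaneous; nodes that share a single AZ share a failure domain, so that assumption does not hold for them, which is the argument for multi-AZ and a node in another region.

```python
def combined_availability(node_availability: float, nodes: int) -> float:
    """Estimate availability of a cluster where any one healthy node can serve.

    Assumes failures are independent and ignores failover time, so treat the
    result as an optimistic upper bound rather than a guarantee.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The cluster is down only when every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    # Hypothetical per-node availability of 99%.
    for count in (1, 2, 3):
        print(count, f"{combined_availability(0.99, count):.6f}")
```

Under these assumptions, a second node in a separate AZ moves a 99% node to roughly 99.99%, and a third node in another region adds two more nines. Real numbers will be lower once failover time and shared dependencies are counted.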
Build your team
Overnight, some companies and their IT teams went from being fully on-premises to wrestling with CloudFormation templates, Quick Start guides, IAM roles, internal load balancers, and Overlay IPs, and to deciphering what exactly that VM size means. Now is the time to build a team to support the journey to the cloud. This will mean several things:
- Adding capacity. Unless you were able to pull off a complete lift and shift, you likely have the same staff managing cloud and on-premises applications. Legacy solutions are known for being temperamental and requiring a lot of work to keep them stable and available. To navigate the cloud journey ahead, you’ll need capacity capable of addressing availability requirements, understanding cloud architecture, and plotting the course forward for enterprise needs.
- Augmenting skills with training. Give your team training for the cloud. To manage and plan the course forward, look for ways to augment the IT excellence within your organization with additional training on cloud solutions, architecture, best practices, and trade-offs. A confident, well-trained staff will pay dividends not only in increased availability, but also by addressing availability, maintenance, and growth in an economical, scalable and logical way. Translation: they’ll avoid wasting money as they build out the rest of your cloud infrastructure.
Integrate automation and analytics
As VP of Customer Experience at SIOS Technology Corp., I have worked with several companies that made the move to the cloud in 2021 without sacrificing HA, DR or their team. If you took achieving the required number of nines of uptime (99.99%) seriously, and having a disaster plan was non-negotiable, then it’s time to add the rigor of analytics and additional monitoring. Ensure that your availability solution has application-aware automation and orchestration for recovery in the event of a disaster or unplanned downtime. Add analytics and automation to solidify your solution and take your cloud migration up another notch, from reactive failovers to proactive notification and mitigation of a failure before it occurs. Imagine being notified of underperforming applications, or of increasing latency, errors, or non-responsive VMs, in time to avoid downtime during peak business times. Analytics are also important because they can reveal systems and applications that may have escaped your original availability architecture.
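As a rough illustration of both ideas, the sketch below first converts an uptime target into a yearly downtime budget, then shows a very simple early-warning check on latency. The function names, the five-sample window, and the 250 ms threshold are hypothetical; they are not part of any SIOS product or cloud provider API. A 99.99% target over a 365-day year works out to about 52.56 minutes of allowable downtime.

```python
from typing import Sequence

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime allowed in a period for a given uptime percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_minutes * (1.0 - availability_pct / 100.0)


def latency_warning(samples_ms: Sequence[float],
                    threshold_ms: float,
                    window: int = 5) -> bool:
    """Flag trouble when the mean of the latest `window` samples is too high.

    Returns False until at least `window` samples exist, so a single slow
    request does not trigger a notification on its own.
    """
    recent = list(samples_ms)[-window:]
    if len(recent) < window:
        return False
    return sum(recent) / window > threshold_ms


if __name__ == "__main__":
    print(f"{downtime_budget_minutes(99.99):.2f} minutes per year")
    # Hypothetical latency readings in milliseconds, trending upward.
    readings = [100, 110, 120, 300, 320, 340, 360, 380]
    print("warn:", latency_warning(readings, threshold_ms=250))
```

With a budget that small, a check like this is only useful if it fires early enough for someone, or some automation, to act before the application actually goes down.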
Update processes and governance
Many things we think of as a failure are rooted in a failure of process. Make sure that your organization’s processes are up to date, well-documented, properly communicated and adhered to. These processes should cover a few key minimums (who, what, when, where, and how), all tied back to the business strategies, goals, and organizational needs as they pertain to the customer.
Make sure that ownership and sign-off processes for your new cloud environment are well-documented. I have seen firsthand the frustration that comes from conflicting, clashing or unresolved roles and responsibilities for customers who have moved from hardware teams that acquire infrastructure to cloud teams. Muddling through a migration is one set of pain points; digging out of a disaster without clear governance is a much bigger, more costly issue.
If you’ve made the leap to cloud, staying there and making it work for you is the next part of the journey. If your cloud journey was sudden or rocky, consider these five points for fixing your cloud journey and know that SIOS Technology can help you improve not only your high availability in the cloud, but also your processes for running in the cloud.
-Cassius Rhue, VP Customer Experience, SIOS Technology Corp.