White Paper: Cost Savings in AWS with SQL Server High Availability
Why do organizations choose the cloud? | SQL Server Cloud High Availability Options | Why did PayGo move SQL Server to the AWS Cloud? | PayGo’s Experience with AWS and SIOS | Traditional SQL Server Clustering vs. SIOS DataKeeper | Multi-Node Clusters | SIOS DataKeeper Cost Savings | How do I get started with SIOS?
Making a change in SQL Server infrastructure from on-premises to the cloud is often a complex undertaking. As most companies begin the process, there are generally numerous questions from both the business and technology teams, such as:
- How to seamlessly move to the cloud?
- How to reduce costs in the cloud?
- How to have a more robust and flexible solution?
- How will the new infrastructure perform?
- How to improve business continuity moving to the cloud?
In this white paper, learn about these considerations when running SQL Server in the cloud and how one company, PayGo, moved to AWS and improved their business continuity with SIOS DataKeeper.
Every customer and every workload has its own reasons for moving to the cloud. Overall, organizations select a cloud solution to take advantage of capabilities that were not previously available with their on-premises or legacy platform, including:
- Management Ease – Cloud offerings remove many of the burdens of managing infrastructure and hardware so you can focus on your core business
- Reliability – Improved uptime for hardware
- Performance – Access to a high-performance platform to meet application and user needs
- Flexibility – Selection of architecture, hardware, sizing, software and geographic choices that may be cost-prohibitive with an on-premises solution
- Security – Security concerns that once deterred cloud adoption have been addressed by mature platforms that comply with domestic and international standards
- Scalability – Ability to easily scale your resource needs as your organization grows, seasonal changes occur, new applications get deployed and more
- Cost Control – Configurable architecture to reduce costs for development, test and production workloads as well as shift budgets from capital to operational expenditures
- Disaster Recovery – Leverage a footprint of data centers within a region or around the globe as part of a plan to keep business operations running and prevent a site failure from causing downtime
Although some organizations see moving to the cloud as giving up some control over their infrastructure, the cloud creates new opportunities for the technology team to focus on core business initiatives while saving costs.
While the cloud offers additional redundancy in terms of power, connectivity and particular infrastructure, a solution is still needed to protect your SQL Server instances. Whether you choose SQL Server Enterprise or Standard Edition to implement high availability in the cloud, it is generally recommended to use synchronous replication between two or more availability zones (AZs) in the same region. This provides a degree of disaster recovery even within the region, since the failure of an entire data center (AZ) does not impact availability. A multi-region implementation provides further separation. For instance, you can replicate from one region in the US to another region in Europe or Asia. For multi-region replication, you would use asynchronous replication.
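The replication guidance above can be captured as a simple rule. The following Python sketch is illustrative only: the `Node` class, the `replication_mode` function, and the node names are assumptions made for this example and are not part of SQL Server, AWS, or SIOS tooling. It selects synchronous replication for nodes in the same region and asynchronous replication for nodes in different regions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one SQL Server node (illustrative only)."""
    name: str
    region: str
    availability_zone: str


def replication_mode(source: Node, target: Node) -> str:
    """Pick a replication mode following the guidance in the text.

    Same region (typically different AZs): synchronous.
    Different regions: asynchronous.
    """
    if source.region == target.region:
        return "synchronous"
    return "asynchronous"


# Example nodes; the names and placement are assumptions for illustration.
primary = Node("sql-primary", "us-east-1", "us-east-1a")
secondary = Node("sql-secondary", "us-east-1", "us-east-1b")
dr_site = Node("sql-dr", "eu-west-1", "eu-west-1a")

print(replication_mode(primary, secondary))  # same region, different AZs
print(replication_mode(primary, dr_site))    # different regions
```

In practice the choice also depends on measured latency and recovery objectives, so a rule like this is a starting point rather than a complete policy.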
Another key factor in the decision to move to the cloud is meeting the ever-increasing service level agreements (SLAs) that organizations need to support a 24×7 user experience. The cloud offers an attractive SLA for hardware, power and connectivity, but a true SLA is more than that. It ensures the application remains available to fulfill user needs even if an application hangs or fails. This is also the case with critical patching processes, where it is imperative to meet user needs and also ensure the latest versions of software are installed to proactively prevent issues. This can be achieved with clustering and data replication technology supported in the public cloud.
PayGo is a privately held integrated utility payment solution provider that manages the largest energy company prepayment programs in the United States. PayGo’s original SQL Server platform consisted of an on-premises SQL Server cluster with two physical servers in the same rack. As their business scaled, this platform quickly became critical for business operations. The solution had to be simple enough for their small team to implement and maintain. PayGo ran SQL Server Standard Edition using an active-passive SQL cluster, with SIOS on the back end to handle the shared storage functionality.
In 2014, PayGo faced a choice between a large capital expense for a hardware refresh and migrating the solution to the cloud. At that time, AWS was their clear-cut choice. One challenge was that a SAN is not available in AWS. Shared storage had been a problem very early on, even with their on-premises solution, because no one on the PayGo team had enough experience to administer a SAN, which is why they reached out to SIOS.
When PayGo moved their SQL Server solution to AWS, supported by SIOS DataKeeper, they experienced:
Low Learning Curve – The migration from on-premises to AWS was achieved with minimal architecture changes and a low learning curve for the PayGo Team.
International Presence – From time to time, PayGo has new customers running their services in Europe and the Middle East. Without AWS, it would be cost-prohibitive for them to expand into these geographies.
Cost Reduction – At the inception, PayGo moved two environments from on-premises locations to AWS. PayGo was spending about $6,000 a month on on-premises data center costs and was able to replicate the solution in AWS for about $900 a month, which is roughly 15 percent of the previous cost. That cost reduction has continued even though they have grown significantly.
Disaster Recovery – One architecture change PayGo made was to place the SQL Server instances in different availability zones in AWS. Having each node in a different AZ gave PayGo not only machine high availability, but also geographic high availability. The AZs are far enough apart that they are isolated from events that would affect electrical service, cooling service, internet service, etc. This enables PayGo to meet all of their customer needs without interruption.
Low Latency – Another benefit of the AWS AZs is that the latency is so low, it can effectively be treated like a LAN. When PayGo moved to AWS in 2014, their network latency was less than 10 milliseconds, which was generally considered LAN speed. PayGo’s experience with AWS network latency between nodes in different availability zones is less than one millisecond. This enabled them to split the nodes across two availability zones, resulting in geographic high availability, which was really appealing to their customers. Geographic separation was something PayGo had been paying quite a bit of money for in their previous on-premises data center model, and in fact that setup was not high availability; it was really disaster recovery with a four-hour failover window.
Solid SIOS Solution – PayGo has grown since the original implementation into four customer environments. They are running four two-node SQL Server clusters, and the SIOS clustering solution has been very solid the whole time. One of the things that PayGo’s technology team really enjoys about SIOS DataKeeper is that once it is set up, you truly don’t have to think about it. PayGo has only needed to call SIOS for support five times in five years. DataKeeper has been reliable and meets the needs of the business.
SIOS Integration – Beyond DataKeeper’s integration with the AWS EC2 platform, SIOS DataKeeper also integrates seamlessly with Windows Failover Clustering. When you want to perform a failover, either manual or automatic, there’s only one place to go: Windows Failover Cluster Manager. It has been really solid not just for SQL Server, but also for file and FTP servers, because DataKeeper replicates at a block level.
A traditional SQL Server cluster looks something like the image below with multiple nodes connected to shared storage. Unfortunately, a shared SAN storage device is not available in the cloud. This is where SIOS DataKeeper delivers a unique solution consisting of application orchestration and data replication to bring high availability and disaster recovery to the SQL Server platform.
SIOS DataKeeper for SQL Server enables customers to build failover cluster instances in the cloud without the shared storage device that a failover cluster instance would normally require. SIOS DataKeeper delivers a SANless cluster, which is a true Microsoft failover cluster and is fully supported by Microsoft. Everything is exactly the same, except that instead of shared storage, DataKeeper leverages locally attached storage on each node and replicates the data at a block level. In the cloud, SIOS DataKeeper is designed to attach to local storage on each cluster node that is configured the same (same volume, same size, same drive letter, etc.) in order to perform block-level replication between the different instances. DataKeeper performs application orchestration with Windows Failover Clustering and high-speed data replication for SANless clustering in the cloud, on-premises or via a hybrid architecture.
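Because block-level replication relies on each node having identically configured local volumes, a pre-deployment check can catch configuration drift early. The Python sketch below is a hypothetical illustration and not a DataKeeper API: the `Volume` class, the `find_mismatched_nodes` function, and the sample node names and sizes are all assumptions. It compares each node's drive letters and volume sizes against a reference node and reports the nodes that differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    """Hypothetical description of a local volume on a cluster node."""
    drive_letter: str
    size_gb: int


def find_mismatched_nodes(cluster: dict[str, set[Volume]]) -> list[str]:
    """Return the names of nodes whose volumes differ from the reference node.

    The reference is the first node in alphabetical order; a node matches
    only if it has exactly the same drive letters with the same sizes.
    """
    if not cluster:
        return []
    names = sorted(cluster)
    reference = cluster[names[0]]
    return [name for name in names[1:] if cluster[name] != reference]


# Sample layout; node names, drive letters, and sizes are made up.
cluster = {
    "node-a": {Volume("D", 500), Volume("E", 200)},
    "node-b": {Volume("D", 500), Volume("E", 200)},
    "node-c": {Volume("D", 500), Volume("F", 200)},  # wrong drive letter
}

print(find_mismatched_nodes(cluster))  # ['node-c']
```

A real deployment would gather this information from each instance rather than from a hard-coded dictionary; the point here is only the comparison itself.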
The value of SIOS DataKeeper includes:
- Simplicity – Remove the complexities of building and maintaining native SQL Server Availability Groups, especially if customers have multiple groups
- Holistic – Fail over the entire instance at once with Windows Failover Cluster Manager
- Support Everywhere – Cloud, on-premises or hybrid solution
- 100% SQL Server Protection – Protect all user defined and system databases (Master, Model, MSDB, etc.) including SQL Server Agent Jobs, Logins, SSIS Packages, etc. which is not possible with SQL Server Availability Groups
- Scalability – No dependence on Distributed Transaction Coordinator, which does not always scale with a large number of databases
- Block Level Replication – Ability to replicate more than just SQL Server, such as File Servers, FTP, etc. with high performance between cluster nodes
- High Availability and Disaster Recovery – Architecture to support local, remote, and multi-site deployments
- Cost Reduction – Rock-solid clustering solution on-premises or in the cloud with SQL Server Standard Edition, which is a fraction of the cost of SQL Server Enterprise licensing
SIOS DataKeeper can also be used to create multi-node clusters for added disaster protection. As shown in the diagram below, data can be replicated to multiple, geographically separated nodes.
Compared with a two-node native SQL Server cluster running Enterprise Edition with Always On Availability Groups and Software Assurance, a two-node SQL Server cluster running Standard Edition with SIOS DataKeeper and Software Assurance saves 58% to 71%. The savings increase as the number of CPU cores scales. This enables server consolidation and the introduction of high availability and disaster recovery at a lower cost with simplified administration.
To help you get started, SIOS offers free trial versions of both SIOS DataKeeper for Windows Server and the SIOS Protection Suite for Linux, and these are available on the Web at us.sios.com. SIOS also offers comprehensive documentation, an assortment of templates that automate all or part of application-specific and/or cloud-specific configurations, responsive support, and a variety of other useful resources to help ensure successful deployments. To learn more about how your organization can benefit from the carrier-class HA and DR protection afforded by SANless failover clustering from SIOS Technology, please contact SIOS by phone at (650)645-7000 or by email at firstname.lastname@example.org.
However you choose to protect your SQL Server databases, keep in mind that the only thing harder than doing something—anything—to better prepare for recovering from a disaster is trying to explain why you didn’t.