Tag Archives: Machine Learning

Are You Over Provisioning Your Virtual Infrastructure?

Right-Sizing VMware Environments with Machine Learning

According to leading analysts, today’s virtual data centers are as much as 80 percent overprovisioned, an issue that is wasting tens of thousands of dollars annually. The risks of overprovisioning virtual environments are urgent and immediate. IT managers face a variety of challenges related to correctly provisioning a virtual infrastructure. They need to stay within budget while avoiding downtime, delivering high performance for end-user productivity, ensuring high availability and meeting a variety of other service requirements. IT often deals with its fear of application performance issues by simply throwing hardware at the problem to avoid any possibility of under-provisioning. However, this strategy drives costly overspending and drains precious IT time. Even worse, when it comes time to compare the economics of on-premises hosting with the cloud, the costs of on-premises infrastructure are greatly inflated when resources aren’t being used efficiently. This can lead to poor decisions when planning a move to the cloud.

With all of these risks in play, how do IT teams know when their VMware environment is optimized?

Having access to accurate information that is simple to understand is essential. The first step in right-sizing application workloads is understanding the patterns of the workloads and the resources they consume over time. However, most tools take a simplistic approach when recommending resource optimization: they use simple averages of metrics about a virtual machine. This approach doesn’t give accurate information. Averages hide the peaks and valleys of usage and the interrelationships among resources, so reconfiguring a VM based on them can cause unanticipated consequences for other applications. To get the right information and make the right decisions for right-sizing, you need a solution such as SIOS iQ. SIOS iQ applies machine learning to learn patterns of behavior of interrelated objects over time and across the infrastructure to accurately recommend optimizations that help operations, not hurt them. Intelligent analytics beats averaging every time.
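To make the point about averages concrete, here is a minimal sketch that compares the mean of a VM's CPU samples with its 95th percentile and peak. The sample data, the function names, and the choice of the 95th percentile are illustrative assumptions, not how any particular product sizes VMs.

```python
from statistics import mean


def percentile(samples, pct):
    """Return the nearest-rank percentile (pct in 0-100) of a list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: ceiling of (n * pct / 100), clamped to at least 1.
    rank = max(1, -(-len(ordered) * pct // 100))
    return ordered[int(rank) - 1]


def summarize_cpu(samples):
    """Compare the average with the 95th percentile and the peak."""
    return {
        "mean": mean(samples),
        "p95": percentile(samples, 95),
        "peak": max(samples),
    }


# Hypothetical hourly CPU utilization (%) for one VM:
# mostly idle, with a four-hour nightly batch job.
cpu = [10] * 20 + [90] * 4
summary = summarize_cpu(cpu)
print(summary)
```

In this made-up workload the mean is roughly 23 percent while the batch window runs at 90 percent, so a recommendation based on the average alone would starve the VM exactly when it is busiest.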

The second step towards a right-sizing strategy is eliminating the fear of dealing with performance issues when a problem happens or even preventing one in the first place.  This means having confidence that you have the accurate information needed to rapidly identify and fix an issue instead of simply throwing hardware at it and hoping it goes away.

Today’s tools are not very accurate. They lead IT through a maze of graphs and metrics without clear answers to key questions. IT teams typically operate and manage environments in separate silos (storage, networks, applications and hosts), each with its own tools. Understanding the relationships among all the infrastructure components requires a lot of manual work and digging. Further, these tools don’t deliver information; they deliver only marginally accurate data, and they require IT to do a lot of work to get it. That’s because they are threshold-based. IT has to set individual thresholds for each metric they want to measure: CPU utilization, memory utilization, network latency, etc. A single environment may need to set, monitor, and continuously tune thousands of individual thresholds. Every time the environment changes, such as when a workload is moved or a new VM is created, the thresholds have to be readjusted. When a threshold is exceeded, these tools often create thousands of alerts, burying important information in “alert storms” with no root cause identified or resolution recommended.

Even more importantly, because these alerts are triggered by measurements of a single metric on a single resource, IT has to interpret their meaning and importance. Ultimately, the accuracy of the interpretation is left to the skill and experience of the admin. When systems are changing and growing so fast that IT simply can’t keep up with it all, the easiest course of action is to over-provision, wasting time and money in the process. Moreover, the actual root cause of the problem is often never fully addressed.

IT teams need smart tools that leverage advanced machine learning analytics to provide an aggregated, analyzed view of their entire infrastructure. A solution such as SIOS iQ helps to optimize provisioning, characterize underlying issues and identify and prioritize problems in virtual environments. SIOS iQ doesn’t use thresholds. It automatically analyzes the dynamic patterns of behavior between the related components in your environment over time. It automatically identifies a wide variety of wasted resources (rogue vmdks, snapshot waste, idle VMs). It also recommends changes to right-size all over- and under-provisioned VMs.
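As a rough illustration of the difference between a fixed threshold and a baseline learned from recent behavior, consider the sketch below. It uses a generic trailing-window z-score, which is a stand-in technique and not SIOS iQ's actual analytics; the function names, window size, limit, and sample data are all assumptions.

```python
from statistics import mean, stdev


def static_threshold_alerts(samples, threshold):
    """Indices where a fixed threshold is exceeded."""
    return [i for i, value in enumerate(samples) if value > threshold]


def baseline_anomalies(samples, window=5, z_limit=3.0):
    """Indices whose value deviates strongly from the trailing window's baseline."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        # Floor the spread at 1.0 so a perfectly flat history cannot divide by zero.
        spread = max(stdev(history), 1.0)
        if abs(samples[i] - mu) / spread > z_limit:
            anomalies.append(i)
    return anomalies


# Hypothetical latency samples (ms): a VM that normally runs in the mid-80s,
# with one genuine spike.
latency = [82, 85, 84, 83, 86, 84, 85, 140, 84, 83]
threshold_hits = static_threshold_alerts(latency, threshold=80)
baseline_hits = baseline_anomalies(latency)
print(len(threshold_hits), "threshold alerts;", len(baseline_hits), "baseline anomalies")
```

With a static limit of 80 ms every sample raises an alert, while the baseline comparison flags only the spike. A real system would also have to relate behavior across components, which this single-metric sketch does not attempt.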

When it detects anomalous patterns of behavior, it provides a complete analysis of the root cause of the problem, the components affected by the problem, and recommended solutions to fix the problem. It not only recommends optimal provisioning of vCPU, vMem, and VMs, but also provides a detailed analysis of cost savings that its recommendations can deliver. Learn more about the SIOS iQ Savings and ROI calculator.

Here are four ways machine learning analytics can help avoid overprovisioning:

  1. Understand the causes of poor performance: By automatically and continuously observing resource utilization patterns in real time, machine learning analytics can identify over- and undersized VMs and recommend configuration settings to right-size each VM for performance. If there’s a change, machine learning can dynamically update the recommendations.
  2. Reduce dependency on IT teams for resource sizing: App owners often request as much storage capacity as possible, while VMware admins want to limit storage as much as possible. Machine learning analytics takes the guesswork out of resource sizing and eliminates the finger-pointing that often happens among enterprise IT teams when there’s a problem.
  3. Eliminate unused or wasted IT resources: SIOS iQ will provide a savings and ROI analysis of wasted resources, including over-provisioned VMs, rogue VMDKs, unused VMs, and snapshot waste. It also provides recommendations for eliminating them and calculates the associated cost savings in both CapEx and OpEx.
  4. Determine whether a cluster can tolerate host failure: With machine learning analytics, IT pros can easily right-size CPU and storage without putting SQL Server or end-user productivity at risk. IT teams gain a deeper understanding of the capacity of the organization’s hosts and know whether a cluster can tolerate failure or other issues.
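The host-failure question in the last item can be approximated with simple arithmetic. The sketch below is an aggregate check only, under stated assumptions: it considers memory alone, assumes the largest hosts are the ones that fail, and ignores per-host placement, hypervisor overhead, and any admission-control policy. The function name and the figures are hypothetical.

```python
def tolerates_host_failure(host_capacity_gb, vm_demand_gb, failures=1):
    """Return True if total VM memory demand still fits after losing hosts."""
    if failures >= len(host_capacity_gb):
        return False
    # Worst case: the hosts with the most capacity are the ones that fail,
    # so keep only the smallest (len - failures) hosts.
    surviving = sorted(host_capacity_gb)[:len(host_capacity_gb) - failures]
    return sum(vm_demand_gb) <= sum(surviving)


# Hypothetical three-host cluster with 256 GB of memory per host.
hosts = [256, 256, 256]
right_sized = [64] * 7   # 448 GB of demand fits on the two surviving hosts
oversized = [64] * 9     # 576 GB of demand does not
print(tolerates_host_failure(hosts, right_sized))
print(tolerates_host_failure(hosts, oversized))
```

The same cluster passes or fails the check depending only on how much memory the VMs are given, which is why right-sizing and failure tolerance are linked.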

To learn more about how right-sizing your VMware environment with machine learning can save time and resources, check out our webinar: “Save Big by Right Sizing Your SQL Server VMware Environment.”

Understanding The Emerging field of AIOps – Part II

This is the second post in a two-part series highlighting how AIOps is changing IT performance optimization. Part 1 explained the basic principles of AIOps. The original text of this series appeared in an article on Information Management.  Here we look at the business requirements driving the trend to AIOps.

Why do businesses need AIOps?

IT pros are moving more of their business-critical applications into virtualized environments. As a result, finding the root cause of application performance issues is more complicated than ever. IT managers have to find problems in a complex web of VMs, applications, storage devices, network devices and services. These components are connected in ways IT can’t always understand.

Often, the components of a VMware or other virtual environment are interdependent and intertwined. When an IT manager moves a workload or makes a change to one component, they can cause problems in several other components without knowing it. If the components are in different so-called silos (network, infrastructure, application, storage, etc.), IT pros have even more trouble figuring out the actual cause of the problem.

Too Many Tools Required to Find Root Causes of Performance Issues

[Figure: SIOS AIOps Survey]

The process of correlating IT performance issues with their root causes is difficult, if not impossible, for IT leaders. According to a recent SIOS report, 78 percent of IT professionals are using multiple tools to identify the cause of application performance issues in VMware. For example, they are using tools for application monitoring, reporting and infrastructure analytics.

Often, when faced with an issue, IT assembles a team with representatives from each IT silo or area of expertise. Each team member uses his or her own diagnostic tools and looks at the problem from a silo-specific perspective. Next, the team members compare the results of their individual analyses to identify common elements. Frequently, this process is highly manual: they look for changes in infrastructure that show up in several analyses in the same time frame. As a result, IT departments are wasting more and more of their budget on manual work and inaccurate trial and error.
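The manual comparison described above, looking for changes that show up in several silos in the same time frame, amounts to grouping events by time. A minimal sketch follows; the tuple layout, the five-minute window, and the sample events are all hypothetical, and real tools would need far richer data than timestamps alone.

```python
from datetime import datetime, timedelta


def correlate_events(events, window=timedelta(minutes=5)):
    """Group events that occur close together and span more than one silo.

    `events` is a list of (timestamp, silo, description) tuples.
    """
    groups = []
    current = []
    for event in sorted(events, key=lambda e: e[0]):
        # Start a new group when the gap from the previous event exceeds the window.
        if current and event[0] - current[-1][0] > window:
            groups.append(current)
            current = []
        current.append(event)
    if current:
        groups.append(current)
    # A group is interesting only if more than one silo reported something.
    return [g for g in groups if len({silo for _, silo, _ in g}) > 1]


# Hypothetical findings from three teams' tools, plus one unrelated event.
start = datetime(2017, 1, 10, 9, 0)
events = [
    (start + timedelta(minutes=2), "application", "SQL query timeouts"),
    (start, "storage", "datastore latency rose"),
    (start + timedelta(minutes=3), "compute", "VM migrated to another host"),
    (start + timedelta(hours=2), "network", "switch port flap"),
]
correlated = correlate_events(events)
for group in correlated:
    print([f"{silo}: {text}" for _, silo, text in group])
```

Here the storage, application, and compute events fall into one group, while the later network event stands alone and is dropped. Time proximity only suggests a relationship; it does not establish which event was the cause.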

To solve this problem and reduce wasted time, IT teams are turning to an AIOps approach. AIOps applies artificial intelligence (i.e., machine learning, deep learning) to automate problem-solving. The AIOps trend is an important shift away from traditional threshold-based approaches that measure individual metrics (CPU utilization, latency, etc.) to a more holistic, data-driven approach. IT managers are using analytics tools to analyze data across the infrastructure silos in real time. They are using advanced deep learning and machine learning analytics tools that learn the patterns of behavior between interdependent components over time. As a result, these tools can automatically identify behaviors between components that may indicate a problem. More importantly, they automatically recommend the specific steps to resolve problems.

What’s Next for AIOps?

Virtual IT environments are creating an enormous volume of data and an unprecedented level of complexity. As a result, IT managers cannot manage these environments effectively with traditional, manual methods. Over the next few years, the IT profession will rapidly move from the traditional computer science approach to a modern “data science” AIOps approach. For IT teams, this means embracing machine learning-based analytics solutions and understanding how to use them to solve problems efficiently and effectively. Finally, executives need to work with their IT departments to identify the right AIOps platform for their business.

Read Part 1

Roadblocks to Optimizing Application Performance in VMware Environments – Part I

This is the first post in a two-part series highlighting challenges IT teams face in optimizing VMware performance. The original text of this series appeared in an article on Data Informed.

When virtual computing first became popular, it was primarily used for non-business critical applications in pre-production environments, while critical applications were kept on physical servers. However, IT has warmed up to virtualization, recognizing the many benefits (reduced cost, increased agility, etc.) and moving more business-critical and database applications into virtual environments. In a recent survey of 518 IT professionals we conducted, we found that 81 percent of respondents are now running their business-critical applications, including SQL Server, Oracle or SAP, in their VMware environments.

VMware Performance Becomes Critical as More Important Applications Are Virtualized

While there are numerous benefits, virtualized environments introduce a new set of challenges for IT professionals. For IT teams tasked with finding and resolving VMware performance issues, specifically those that can impact business-critical applications, many find they are hitting the same cumbersome roadblocks related to tools, time and strategy.

IT Pros Need Multiple Tools to Gain a Holistic View of their VMware Environments.

According to the survey results, 78 percent of IT professionals are using multiple tools, including application monitoring, reporting and infrastructure analytics, to identify the cause of VMware performance issues for important applications. Furthermore, ten percent of IT professionals are using more than seven tools to understand their VMs and the issues that affect VMware performance. Optimizing VMware performance and availability is incredibly complex, and the dynamic nature of these environments requires highly advanced tools to address even the most standard performance issues.

Relying on several reporting tools every time an issue arises just isn’t sustainable for most IT teams. This is partly due to the fact that solving application performance issues requires a view of multiple IT disciplines or “silos” such as application, network, storage and compute. In larger organizations, that means each time an issue arises, representatives from each discipline need to come together and compare their findings– and the analysis results from the application team’s tool may point to a somewhat different cause than the storage team or the network team’s tool. The current strategy of relying on multiple tools and teams to evaluate each silo leaves IT with the manual, trial and error task of finding all the relevant data, assembling it and analyzing it to figure out what went wrong and what changed to cause the problem.

Stay tuned for part two of this series, where we’ll discuss issues related to time and resources wasted in uncovering issues, as well as finding the root cause of VMware performance issues.

Read Roadblocks to Optimizing Application Performance in VMware Environments – Part II

Expert Advice on High Availability SQL Server and Machine Learning Analytics – Blogs, Webinars and Live Events

SIOS experts in high availability SQL Server cluster protection routinely share their knowledge and provide expert advice through webinars, blogs, and live events.

Blog Post: Step-by-Step Guide to High Availability SQL Server v.Next Linux High Availability
Click here to learn how to deploy a Linux VM in Azure running SQL Server and how to configure a 2-node failover cluster to make it highly available without the need for shared storage. https://clusteringformeremortals.com/category/high-availability/

Blog Post: Deploying a Highly Available File Server in Azure IAAS (ARM) with SIOS DataKeeper
This blog is a step-by-step guide to deploying a two-node File Server Failover Cluster in a single region of Azure using Azure Resource Manager.

Live Webinar: December 15, 2016 –  Webinar: Keeping the Peace with Your Sys Admins while Providing High Availability SQL Server. Join SQL Server MVP Dave Bermingham as he walks through common use cases that cause conflict between SQL DB Admins and VM Admins when SQL Server HA is involved. He will also describe a simple, conflict-free way for SQL Server admins to get what they need for a successful implementation of SQL Server. Register here.

Live Webinar: December 8, 2016 – Understanding vSphere Analytics: Machine Learning vs Threshold-based. Join David Davis, 8-time vExpert and Partner at ActualTechMedia, and experts from SIOS Technology in this real-world, practical, hands-on webinar! Find out the differences between vSphere analytics approaches and how to choose the solution that’s best for you. Register here.

Recorded Webinar – VMware Guest Based SQL Server High Availability Clusters – Ways to Protect SQL and Maintain Flexibility. Watch this recorded webinar and hear Dave Bermingham, Microsoft Clustering and Datacenter MVP and Tony Tomarchio, Director of Field Engineering at SIOS discuss ways to create a high availability cluster to protect SQL in a VMware environment without sacrificing IT flexibility or important VMware features, and whether you can have a cluster and multi-site replication with VMware. Register here.

SIOS: Essential for Mission-Critical VMware Environments

Guest Blog: Jason Bloomberg, Intellyx

Virtualization has unquestionably become a critical and ubiquitous feature of the enterprise operational environment, and VMware clearly predominates. Enterprises rely upon their VMware technology to support mission-critical infrastructure, including the systems of record that run the business.

Avoiding VMware-related performance issues – especially when such issues may lead to downtime – has thus become a mission-critical priority. And yet, finding the root cause of application performance issues in VMware vSphere environments can be difficult and costly.

The interrelated nature of infrastructure and applications – particularly for systems of record like Microsoft SQL Server – can obscure the root cause of performance issues. As a result, SQL database administrators, VMware infrastructure managers, network managers and perhaps other domain experts have to collaborate to find the real cause and devise a solution.

Implementing a solution in such complex, distributed VMware installations, however, is itself difficult to achieve in practice. Such solutions often take an expert – but such experts can be hard to find, and many shops find themselves making do with less skilled VMware professionals.

In other cases, VMware shops turn to third-party consultants for the proper configuration. Even assuming the consultants do their job properly, once they hand over the environment to their customer, maintaining it once again requires skills that may be in scant supply.

For today’s enterprises that depend upon VMware, however, there are rarely any viable alternatives. Their environments continue to grow in size and complexity, and their businesses increasingly depend upon them as the mission-critical infrastructure they are.

SIOS: Addressing VMware Complexity

This growth in complexity combined with scarce expert VMware skills is the challenge that SIOS addresses with their SIOS iQ product. SIOS fills this gap in the existing VMware management tools used to address infrastructure problems in complex, dynamic virtualized environments.

SIOS has continued to enhance the capabilities of SIOS iQ since launching the product in 2015, helping its customers understand complex IT operations and resolve issues in dynamic VMware environments.

SIOS focuses on the needs of IT Operations Managers and application administrators to address the root causes of performance problems, identify underused resources, and optimize configurations to help them get the most value from their virtualized environment.

SIOS was among the first in the industry to integrate machine learning into its infrastructure analytics as well as deep database performance analytics. It uses advanced machine learning to eliminate false positives and alert storms, thus providing customers with the information they need in an easy to use, graphical interface.

SIOS iQ can also provide instantaneous root cause analytics and recommendations with deep database performance monitoring and optimization. In essence, SIOS iQ is a system that is able to learn about an organization’s virtual infrastructure and how it operates on a day-to-day basis. It can identify anomalies before they become serious issues, thus reducing the chance of false alarms without human intervention.

New Capabilities from SIOS iQ

SIOS is now announcing additional capabilities for predicting and forecasting performance and capacity utilization in complex VMware environments. In addition, SIOS iQ users can now more clearly define Microsoft SQL Server application-specific root causes of performance issues by leveraging SQL Sentry Performance Advisor.

SIOS iQ supports the integration and correlation of Performance Advisor’s custom and standard events out of the box. As a result, users can immediately know where a problem started and whether it is infrastructure or application-specific. Instead of spending time gathering and reviewing data or arguing between departments, users can now directly take action to correct the problem before it becomes critical.

SQL Server is among the most popular databases that run on the VMware platform, explaining why SIOS leveraged its partnership with SQL Sentry to implement SQL Server-specific performance analysis. In addition, SIOS is planning on supporting a variety of systems of record, as the diagram below illustrates.

SIOS iQ Application-Specific Performance Monitoring and Root Cause Analysis (Source: SIOS)

The new capabilities from SIOS iQ directly correlate observed performance anomalies with intelligent performance-related events from deep within the SQL Server platform. It is now possible to link from a SQL performance alert within SIOS iQ to SQL Sentry in context.

Furthermore, SQL Sentry users can create their own events with description and appropriate tags for correlation with SIOS iQ, thus correlating events from the applications down to the infrastructure.

In addition to its newly added support for SQL Server, SIOS iQ also filters for selective analysis across vSphere clusters. SIOS iQ shows the environment by cluster, enabling users to select which cluster to view in the SIOS Dashboard. As a result, they can observe all environments together or isolate the view to an individual cluster, or even to specific events taking place within that cluster.

The Intellyx Take

The more complex the VMware environment, the more important it becomes to have a platform like SIOS iQ in place – and many of today’s enterprise VMware deployments are extraordinarily complex.

The fact that many enterprise systems of record now run on VMware environments ups the stakes for VMware performance. Systems of record are mission critical – and as organizations become increasingly software-driven, digitally transformed enterprises, this mission criticality only becomes more central to the viability of the business itself.

Complexity, however, is the enemy of mission criticality. Whether it be gaps in high availability, misconfigured VM instances, or other issues with capacity, performance, or availability, the list of things that can go wrong continues to explode. And as is particularly true in virtualized operational environments, what can go wrong eventually will.

With SIOS iQ, SIOS is bringing together all the elements that make up a monitoring and root cause analysis platform that modern enterprises with such complex, important VMware investments require. And in spite of the challenges with VMware’s complexity, it’s not going anywhere any time soon – and neither is SIOS.

Copyright © Intellyx LLC. SIOS is an Intellyx client. At the time of writing, none of the other organizations mentioned in this paper are Intellyx clients. Intellyx retains full editorial control over the content of this paper.

Yahoo Finance: SIOS CTO Sergey Razin to Discuss Machine Learning as a Key Ingredient in IT Operations Analytics at the Seattle MLconf

SIOS Technology Corp. (www.us.sios.com), maker of SAN and #SANLess clustering software products, today announced that its CTO, Sergey A. Razin, Ph.D., will present a session at the Machine Learning Conference (MLconf) taking place this week in Seattle, WA, about using machine learning to optimize the performance, efficiency and reliability of large, complex virtual and cloud environments.

MLconf events host speakers from various industries, research arenas and universities to discuss recent research and applications of Machine Learning methodologies and practices. Titled, “Machine learning as the key ingredient for making the ‘self-driving’ data center a reality,” Dr. Razin’s session is scheduled for Friday, May 1 at 12:20 PM at 415 Westlake located at 415 Westlake Ave N., Seattle, WA. For more information about the MLconf in Seattle or to register, visit here.

View the full article at Finance.Yahoo.com

MorningStar: SIOS CTO Sergey Razin to Discuss Machine Learning as a Key Ingredient in IT Operations Analytics at the Seattle MLconf

View the full article at MorningStar.com

Wall Street Select: SIOS CTO Sergey Razin to Discuss Machine Learning as a Key Ingredient in IT Operations Analytics at the Seattle MLconf

View the full article at WallStreetSelect.com

Street Insider: SIOS CTO Sergey Razin to Discuss Machine Learning as a Key Ingredient in IT Operations Analytics at the Seattle MLconf

View the full article at StreetInsider.com

Investor Place: SIOS CTO Sergey Razin to Discuss Machine Learning as a Key Ingredient in IT Operations Analytics at the Seattle MLconf

View the full article on InvestorPlace.com