With St. Patrick’s Day just around the corner, we’ll soon see green shamrocks adorning our offices, and my friend Kent’s hideous shamrock golf shorts will make their unwelcome annual appearance. With luck as a focal point of the holiday, those of us in IT should pause to consider whether our processes ensure the best possible practices or whether we’re rolling the dice every day that we go into work.
AIOps has risen to prominence, and we’ve moved beyond the skepticism about whether AI can work in IT Operations. Yet a majority of enterprises still underutilize AI in their workflows and still rely on luck to catch their problems. We at Loom Systems offer five questions to identify whether you’re too reliant on luck in your IT Operations process, along with tips on what to do if you are.
1) Can you define the type of problems that will occur over the next month?
Most people would answer no to this question simply because of the complexity of their environments and the near-infinite number of ways things can break. If you’re in this group, putting more effort into creating thresholds and dashboards for the problems you can foresee is a futile exercise, because it’s just a matter of time before an unforeseen problem catches you off guard. The only way to be truly prepared for the unknown is to monitor every datapoint within your logs and to apply the scale of AI to measure each one against its own normal behavior.
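To make "measure each one against its own normal behavior" concrete, here is a minimal sketch in Python. It is an illustration only, not how Sophie or any other AIOps product works: the `BaselineMonitor` class, the metric name, and the window and threshold values are all hypothetical, and a simple rolling z-score stands in for the far richer models a real tool would use.

```python
from collections import defaultdict, deque
from statistics import mean, stdev


class BaselineMonitor:
    """Hypothetical helper: keeps a rolling baseline per metric and flags outliers."""

    def __init__(self, window=30, threshold=3.0, min_samples=5):
        self.threshold = threshold      # how many standard deviations count as abnormal
        self.min_samples = min_samples  # readings needed before we trust the baseline
        # One bounded history per metric, created on first sight of that metric.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, metric, value):
        """Return True if value deviates from this metric's recent baseline."""
        past = self.history[metric]
        anomalous = False
        if len(past) >= self.min_samples:
            mu = mean(past)
            sigma = stdev(past)
            if sigma > 0:
                anomalous = abs(value - mu) / sigma > self.threshold
            else:
                # A perfectly flat baseline: any change at all is a deviation.
                anomalous = value != mu
        # Simplification: every reading, normal or not, joins the baseline.
        past.append(value)
        return anomalous


# Assumed sample data: a latency metric that is stable, then spikes.
monitor = BaselineMonitor()
readings = [100, 102, 98, 101, 99, 100, 103, 97, 250]
flags = [monitor.observe("api.latency_ms", r) for r in readings]
# Only the final reading (250) is flagged.
```

Notice that nobody set a threshold for `api.latency_ms` in advance. The baseline is learned from the data itself, so the same code covers a metric that appears for the first time tomorrow.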
2) Do you accurately escalate tickets to the correct team every time?
One of our customers described their problem before Loom as each team’s desire to optimize its Mean-Time-To-Innocence (MTTI) by proving that a critical issue is not rooted in its own area of oversight. This happens because of the skills gap within enterprises. Each area of the stack requires such a depth of expertise that it’s impossible for any single individual or team to fully understand how their piece of the puzzle interacts with all of the others. The solution is an AIOps tool that can automatically correlate issues and accurately pick out the root cause from among its many symptoms.
3) When new data is introduced, does your team know the proper KPIs to use to indicate a problem?
Anyone who answers “yes” to this question must qualify it with “as long as nothing new is introduced to the environment.” Unfortunately, none of us has the luxury of running a static environment. This is an era of transformation in IT; whether it’s digital transformation, cloud migration, containerization, or serverless, your organization is probably heavily committed to a period of change for the foreseeable future. And since every change introduces a new, unknown risk that something will break in an unpredictable way, you can’t rely on setting the proper KPIs in advance. Instead, use an AIOps tool that can understand new data and begin detecting abnormal behavior immediately, without a human having to tell it which KPI signals an issue.
4) Once an issue is identified, are the answers available at your fingertips or is judgment required to apply the right remediation?
Most IT leaders admit that judgment is required, because there is no single repository that offers resolutions across the full IT stack. But in the age of AI, when my 4-year-old can ask Alexa any question that comes to his mind and get a polite and accurate answer from her, why should our highly paid IT engineers have to search Google for resolutions to their problems? The answer is that they shouldn’t. Instead, they should use an AIOps solution with a built-in repository that matches any issue found to the proper resolution.
5) Do you have the appropriate headcount to detect and remediate key issues before they cause financial loss to the business?
The IT industry is experiencing rapid growth, and hiring demand has never been stronger. That means that unless you’re a unicorn or offer onsite haircuts and car washes, you’re probably having a tough time filling your open engineering roles. You can play the game of offering higher and higher salaries to attract employees at the risk of shrinking margins and higher operational costs, or you can fill the gap with AIOps. The AI approach is a win-win: hiring pressures subside, and existing employees become more effective in their roles, which raises satisfaction and encourages longer retention.
St. Patrick’s Day is one of the year’s best holidays for its good-natured fun. But this year, while you’re putting drops of green food coloring into your beer (or kombucha, if that’s your style), think about whether it’s time to rely on the science of AIOps and reduce your reliance on luck in your IT operations process.
Interested in knowing if your organization is ready for AIOps? Register for our upcoming webinar, where Trace3 AIOps expert David Ishmael and I will discuss the process you can follow to find out. Tuesday, March 19, 1PM PST/4PM EST.
Loom Systems delivers an AIOps-powered log analytics solution, Sophie, to predict and prevent problems in the digital business. Loom collects logs and metrics from the entire IT stack, continually monitors them, and gives a heads-up when something is likely to deviate from the norm. When it does, Loom sends out an alert and a recommended resolution so DevOps and IT managers can proactively attend to the issue before anything goes down.
Get Started with AIOps Today!