One of the most common questions I get when discussing IT operations with enterprises is the difference between event management and AIOps. Because the IT landscape evolves rapidly and companies are free to invent and define terms however they wish, it’s understandable that IT leaders are confused. A skeptic might even say that the confusion is intentional, created by companies trying to gain an edge through marketing rather than product superiority. Whatever the case, it’s worth the effort to understand the true differences if you plan to get the most out of AIOps for your organization.
Let’s start by asking: “What is event management?” Traditionally, event management (as defined by ITIL) is the process that monitors all events that occur through the IT infrastructure. It allows for normal operation and also detects and escalates exception conditions. In an IT environment, this plays out by consolidating different monitoring signals and alerts, combining them with traditional events (usually warnings, exceptions, or errors), then investigating and determining the right corrective action to take as a result. Companies engage in event management efforts because they are often overwhelmed by noise from useless alerts and want to reduce Mean-Time-to-Resolution through better recognition and prioritization of true incidents. Correlation is a key aspect of event management: it shows that seemingly unique events are actually part of the same problem. So far, this is very straightforward. Event management is the noble pursuit of greater efficiency and the filtering out of true problems from the noise. Now, let’s get to AIOps.
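To make the correlation idea concrete, here is a minimal sketch of rule-based event correlation. It assumes a deliberately simple rule: alerts from the same host that arrive within five minutes of each other belong to one incident. The `Alert` fields, the `correlate` function, and the sample data are all hypothetical and not taken from any particular event management product.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since some reference point (assumed format)
    host: str
    message: str


def correlate(alerts, window_seconds=300.0):
    """Group alerts from the same host that arrive within
    window_seconds of the previous alert in that host's open incident."""
    incidents = []
    open_incident = {}  # host -> most recent incident (list of alerts)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = open_incident.get(alert.host)
        if current and alert.timestamp - current[-1].timestamp <= window_seconds:
            current.append(alert)  # same problem, still ongoing
        else:
            current = [alert]  # start a new incident for this host
            incidents.append(current)
            open_incident[alert.host] = current
    return incidents


# Hypothetical sample alerts
alerts = [
    Alert(0, "db-1", "disk latency high"),
    Alert(60, "db-1", "query timeout"),
    Alert(90, "web-1", "5xx rate elevated"),
    Alert(1000, "db-1", "disk latency high"),
]
incidents = correlate(alerts)
for incident in incidents:
    print(incident[0].host, [a.message for a in incident])
```

Note that the grouping rule (same host, fixed time window) has to be supplied by a person. That is typical of this style of correlation and is worth keeping in mind for the comparison that follows.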
AIOps also has a straightforward definition. Artificial Intelligence has been defined since its inception in the mid-20th century as the capability of a machine to imitate human behavior. The “Ops” aspect of AIOps refers to IT operations, so AIOps is a machine thinking like a human to create better IT operations decisions and outcomes. So, why the confusion?
The confusion happens when we believe too much of what we read (this article excepted, of course). We’re told by marketers of event management tools that their products are now called AIOps. Rather than create a product that can actually think like a human engineer (very hard to do) and cover the full spectrum of IT operations (quite vast), from detection of a problem through reduction of noise, investigation, and remediation, it’s much easier to simply change the definition to fit what you already created. The only loser is the consumer who ends up with far less than they expected when they purchased a tool labelled as AIOps.
So, as a free service to you, the IT leader trying to maximize what AIOps can bring to your organization, I offer four criteria to identify a true AIOps solution:
- It needs to actually detect problems rather than rely on alerts or events from other sources. This matters because if your alerting has a blind spot and misses a problem in your environment, event management won’t solve it. Only a true AIOps solution can fill that gap. Relying on event management is like relying on the Tornado Guard, as seen on the world’s best website, www.xkcd.com.
- To detect the problems described in the first criterion, an AIOps solution should analyze raw data from logs instead of just alerts or events. When engineers work to solve complex IT incidents, they use raw logs because the logs are the purest source of truth and the most detailed and granular data on your systems. A true AIOps platform will therefore always analyze logs, because they are the best way to see the full depth of the issues in your system.
- It should correlate and consolidate without requiring you to tell it how to do so. Correlations abound in complex systems with massive amounts of data like today’s IT systems. A true AIOps solution will find the hidden correlations that you haven’t thought of. If it can’t, then you don’t have true AIOps.
- It should offer a solution to your problem rather than just identify it for you. If you went to a doctor and they diagnosed your illness but didn’t prescribe a treatment, you would be pretty disappointed. You should expect the same from your AIOps tool. The resolutions are out there within product knowledge bases, developer forums and IT resolution boards. If your tool is truly AIOps, it will be able to leverage these resolutions to give you a solution to your issues.
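As a rough illustration of the first two criteria, the sketch below looks for a problem directly in raw log lines instead of waiting for an upstream alert. It assumes a hypothetical log format (`YYYY-MM-DD HH:MM:SS LEVEL message`) and a very simple statistical rule: flag any minute whose error count is far above the history seen so far. The function names and sample data are invented for the example. A real AIOps platform would use far richer models; this only shows the principle.

```python
import re
import statistics

# Assumed log format: "2024-01-01 10:00:30 ERROR upstream timeout"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2} (\w+) ")


def error_counts_per_minute(lines):
    """Count ERROR lines per minute; minutes with no errors are kept as 0."""
    counts = {}
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        minute, level = match.groups()
        counts.setdefault(minute, 0)
        if level == "ERROR":
            counts[minute] += 1
    return counts


def find_anomalous_minutes(counts, threshold=3.0, min_history=5):
    """Flag minutes whose error count exceeds the mean of all earlier
    minutes by more than `threshold` standard deviations."""
    minutes = sorted(counts)
    anomalies = []
    for i, minute in enumerate(minutes):
        history = [counts[m] for m in minutes[:i]]
        if len(history) < min_history:
            continue  # not enough history to judge yet
        mean = statistics.mean(history)
        stdev = statistics.pstdev(history) or 1.0  # avoid division by zero
        if (counts[minute] - mean) / stdev > threshold:
            anomalies.append(minute)
    return anomalies


# Hypothetical sample: six quiet minutes, then a burst of errors
lines = []
for minute in range(6):
    lines.append(f"2024-01-01 10:0{minute}:00 INFO request served")
    lines.append(f"2024-01-01 10:0{minute}:30 ERROR upstream timeout")
lines += [f"2024-01-01 10:06:{s:02d} ERROR upstream timeout" for s in range(20)]

anomalies = find_anomalous_minutes(error_counts_per_minute(lines))
print(anomalies)
```

No alert rule for "upstream timeout" was configured anywhere in this sketch; the spike is found from the logs themselves, which is the gap the first criterion describes.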
In short, don’t be fooled by event management tools that masquerade as AIOps and under-deliver on their promises. Demand that your AIOps tool have the four capabilities above and maximize the tremendous ROI that’s available to you.
Loom Systems delivers an AIOps-powered log analytics solution, Sophie, to predict and prevent problems in the digital business. Loom collects logs and metrics from the entire IT stack, continually monitors them, and gives a heads-up when something is likely to deviate from the norm. When it does, Loom sends out an alert and recommended resolution so DevOps and IT managers can proactively attend to the issue before anything goes down.