Enabling Smarter, Safer and More Connected Vehicles

This blog is authored by Clayton Donley, Vice President and General Manager, Broadcom

Long before Gartner coined the term "AIOps" in 2017, organizations hunted for anomalies in their data with algorithms that used machine learning and AI techniques. The continuing surge in data volumes only accelerated the practice as enterprises looked for ways to improve the efficiency and effectiveness of their IT operations.

Yet AIOps has often fallen short of the inflated promises made for it, and the resulting disappointment has led some to declare the death of AIOps.

Premature, to say the least. While IT teams can get bogged down by the overwhelming volume of alerts and KPIs, the fact remains that when it comes to correlated application and infrastructure monitoring and incident management, AIOps remains extremely valuable for IT discovery and troubleshooting.

It’s also about to get a lot more effective thanks to the emergence of Generative Artificial Intelligence (GenAI).

Reducing time to action

AIOps has done a good job of paring monitoring data down to size while reducing noise and duplication. But even when the data is more manageable, you still need to figure out what to do next: the diagnostic data has to be made actionable. When it comes to monitoring, one question matters above all: how do I reduce the mean time to resolution? As more data flows in, there’s more information to parse, and more wild goose chases to eat up valuable time.
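To make the two ideas above concrete, here is a minimal Python sketch of duplicate suppression and of the mean-time-to-resolution calculation. It is an illustration only, not how any particular AIOps product works: the `Alert` fields, the five-minute window, and both function names are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alert:
    source: str          # hypothetical field: host or service that raised the alert
    check: str           # hypothetical field: name of the failing check
    raised_at: datetime


def deduplicate(alerts, window=timedelta(minutes=5)):
    """Drop an alert if the same (source, check) fired within `window` before it."""
    kept = []
    last_seen = {}
    for alert in sorted(alerts, key=lambda a: a.raised_at):
        key = (alert.source, alert.check)
        previous = last_seen.get(key)
        if previous is None or alert.raised_at - previous > window:
            kept.append(alert)
        # Track every occurrence so a continuous burst stays collapsed.
        last_seen[key] = alert.raised_at
    return kept


def mean_time_to_resolution(incidents):
    """Average of (resolved - opened) over a non-empty list of (opened, resolved) pairs."""
    durations = [resolved - opened for opened, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)
```

Deduplication shrinks the pile of alerts but leaves mean time to resolution untouched until someone acts on what remains, which is the gap described above.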

Big organizations with complex, mission-critical IT environments need visibility to understand issues before they snowball. They can’t afford surprises, and they need to be able to locate both the source of a problem and its impact. Whether the fix is replacing a hard drive or rolling back to the last API version, they’ve got to be precise and then move fast.

However, the challenge has only grown over the last couple of decades. A large enterprise now might have thousands of applications to observe. That means sifting through a ton of data across different domains, from hardware alerts to base operating systems to virtualization layers and containers. All the while, IT budgets continue to shrink.

But the past is not necessarily prologue; there are better times ahead for AIOps practitioners. We’re on the cusp of a new era. Today, when something goes wrong, whether it’s a storage question, a code issue or a memory leak, the answer is to mobilize a bunch of experts to figure out a solution. Soon, systems leveraging AI will be able to advise on the next steps to take and get it right the first time.

Traditional AI can help with specific tasks, using predefined rules and patterns to analyze data and make predictions. GenAI takes things to the next level, moving from pattern recognition to pattern creation. With GenAI, you will be able to ask a question in context and get an understandable answer. For the first time, computers will be involved in fixing themselves. Think about the impact on your support database. Instead of picking up the phone, support teams can query the system to gauge the problem and then resolve it by themselves.

What’s more, GenAI will cut the time it takes to resolve tickets by helping IT teams quickly understand where to focus their attention. Instead of wasting time navigating through a veritable ocean of alerts, GenAI will be able to speed the process, analyzing and summarizing effective courses of action.

The future is fast approaching

Here’s where companies like ours are investing to improve the technology. We’re working to figure out the best ways to leverage GenAI in partnership with our customers to ensure that we're really delivering on the business outcomes that most matter to them.

While AI can serve up expert advice, the error rates of generative AI are still too high to let it go all Terminator on your IT systems, running commands unsupervised. But it can still be a capable advisor, providing a sensible course of action when you need to react to a set of alerts. You won’t need to wait until that one expert on the payroll happens to be free to look at the data. This approach is about pure efficiency. Application specialists and architects can again focus on their designated work as a completely new triage process frees up resources and capacities. Now, you can get ahead of any issues because your own “expert” will be available to review data all the time.
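One way to picture the advisor role is a review gate between what the model proposes and what actually runs. The Python sketch below assumes that shape. The `Suggestion` type, its fields, and the `gate` function are hypothetical names chosen for illustration, and nothing in it calls a real GenAI service or executes a command.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Suggestion:
    alert_summary: str   # hypothetical field: what the model was asked about
    command: str         # hypothetical field: remediation the model proposes
    rationale: str       # hypothetical field: explanation shown to the reviewer


def gate(suggestions: List[Suggestion],
         approve: Callable[[Suggestion], bool]) -> List[str]:
    """Return only the commands a human reviewer approved; nothing is run here."""
    approved = []
    for suggestion in suggestions:
        if approve(suggestion):
            approved.append(suggestion.command)
    return approved
```

In practice `approve` would be a person clicking a button in a ticketing tool. Keeping execution outside this function is the design choice: the model advises, and a human decides.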

So, when is all this going to happen? Technology predictions are always fraught, but I think all this is going to happen a lot faster than most folks might assume.

For example, look at where ChatGPT was at the beginning of last year versus where it was at the beginning of this year. The difference in capability and adoption is night and day. What’s more, we're now seeing the proliferation of AI beyond just a couple of vendors.

We're also seeing big companies pushing hard to incorporate it into their operations, just as they did in previous eras when Linux and open source were relatively new but incredibly promising technologies. Also, the pace of change is accelerating. The models are far better than what was available last year, in some cases a hundred times better. And you don’t need to make big investments or hire expensive specialists to reap the benefits. Now it's something enterprises can do by making use of consumer-grade GPUs in their data center or in the cloud.

Long story short, 12 months from now, I believe it will be hard to find vendors that don't incorporate GenAI to improve the efficiency of their AIOps processes. Full self-healing and full remediation are likely a little further out, but they’re on the horizon. Still, I'd be surprised if most AIOps tools five years hence cannot do some level of auto-remediation based on improvements in AI.

AI technology is changing old assumptions. We’re now on the cusp of a new era in which companies routinely – and rapidly – identify and remediate the root causes of alerts. That’s a big deal. A very big deal.
