
The next big thing in IT operations – Gigaom

IT operations have changed dramatically in the past two decades, but no change is more important than the introduction of artificial intelligence (AI) and machine learning (ML) to accelerate, improve, and automate the monitoring and management of IT infrastructure. AIOps tools have been using big data and ML in daily operations since 2017 and promise to become an important tool for IT organizations of all sizes.

But what exactly is AIOps? Let’s take a look at the basics of the technology, examine what it was designed for, and see how it is evolving.

What is AIOps?

By applying big data and ML to conventional analysis tools, AIOps can automate some parts of IT operations and optimize others through insights drawn from data. The goal is to reduce the time IT operations teams spend on administrative and repetitive activities that are still critical to the operation of the larger company.

AI-enabled Ops solutions can learn from the data that companies create about their daily processes and transactions. In some cases, the tools can diagnose and correct problems using pre-programmed routines, such as restarting a server or blocking an IP address that appears to be attacking one of your servers. This approach has several advantages:

  1. It removes people from many processes and raises an alert only when an intervention is required. This means fewer operating personnel and lower costs.
  2. It integrates AIOps with other business tools and processes, such as DevOps, governance, and security.
  3. It can spot trends and be proactive. For example, an AIOps tool can monitor an increase in the errors logged by a switch and predict that a failure is imminent.
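
The switch example in point 3 can be sketched in a few lines. This is a minimal illustration, not how any particular AIOps product works: it swaps the ML model for a plain least-squares slope over recent error counts, and the function names, the sample data, and the threshold of 5 errors per interval are all assumptions made for the example.

```python
from statistics import mean


def error_trend(counts):
    """Return the least-squares slope of error counts per sampling interval."""
    n = len(counts)
    if n < 2:
        return 0.0  # not enough samples to estimate a trend
    x_mean = (n - 1) / 2
    y_mean = mean(counts)
    num = sum((i - x_mean) * (c - y_mean) for i, c in enumerate(counts))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def failure_likely(counts, slope_threshold=5.0):
    """Flag a device whose logged errors grow faster than the (assumed) threshold."""
    return error_trend(counts) > slope_threshold


# Hypothetical error counts logged by two switches over six intervals.
steady_switch = [2, 3, 2, 4, 3, 2]
failing_switch = [2, 5, 11, 18, 30, 47]

print(failure_likely(steady_switch))   # flat error rate
print(failure_likely(failing_switch))  # steadily rising error rate
```

A real tool would learn the threshold from historical failures rather than hard-code it, but the shape is the same: watch a metric over time and act on its direction, not only on its current value.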

AIOps categorization

AIOps builds on an existing category of tools, known as CloudOps and Ops tools, by adding AI subsystems to them. This leads to a number of new functions, such as:

  • Predictive error detection: This is accomplished by using ML to analyze the activity patterns of similar servers and determine what has caused a failure in the past.
  • Self-healing: If a problem is identified with a cloud-based or on-premises component, the tool can take pre-programmed corrective actions, for example restarting a server or disconnecting from a faulty network device. This should cover about 80 percent of Ops tasks, which are then automated for all but the most critical problems.
  • Connecting to remote components: The ability to connect to remote components such as servers and network devices inside and outside of public clouds is critical to the effectiveness of an AIOps tool.
  • Custom views: Information dashboards and views should be configurable for specific roles and tasks to promote productivity.
  • Engaging infrastructure concepts: This refers to the ability to collect operational data from storage, network, compute, data, application, and security systems, and to manage and repair them.
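
The self-healing behavior described above amounts to a lookup from a detected problem to a pre-programmed routine, with a fallback to a human for anything critical or unknown. The sketch below assumes a hypothetical `Problem` record and two stand-in routines that only return a description of what they would do; a real tool would call cloud or device APIs instead.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    component: str          # e.g. a server or network device name
    kind: str               # problem type reported by the detection layer
    critical: bool = False  # critical problems always go to a human


def restart_server(problem):
    # Stand-in for a real restart call against a server or cloud API.
    return f"restarted {problem.component}"


def disconnect_device(problem):
    # Stand-in for isolating a faulty network device.
    return f"disconnected {problem.component}"


# Hypothetical mapping of problem types to pre-programmed corrective actions.
ROUTINES = {
    "server_unresponsive": restart_server,
    "faulty_network_device": disconnect_device,
}


def handle(problem):
    """Run the matching routine, or escalate if none applies or the problem is critical."""
    routine = ROUTINES.get(problem.kind)
    if problem.critical or routine is None:
        return f"escalated {problem.component} to an engineer"
    return routine(problem)


print(handle(Problem("web-01", "server_unresponsive")))
print(handle(Problem("db-01", "server_unresponsive", critical=True)))
print(handle(Problem("cache-02", "disk_corruption")))
```

Keeping the escalation path explicit reflects the point made in the list: automation handles the routine cases, while the most critical problems still reach a person.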

We can divide AIOps into four categories: active, passive, homogeneous and heterogeneous:

Active AIOps refers to tools that can themselves fix the system problems they identify. This proactive automation, in which detected problems are fixed automatically, is where the full value of AIOps is realized. With active AIOps, companies can hire fewer Ops engineers while significantly increasing availability.

Passive AIOps can see but not touch. These tools are unable to take corrective action when problems are identified. However, many passive AIOps vendors work with third-party tool providers so that their products can act autonomously. This approach typically requires some do-it-yourself integration work from IT organizations.

Passive AIOps tools are largely data-oriented and spend their time collecting information from as many data points as they can connect to. They also offer real-time and historical data analysis that powers detailed dashboards for operations professionals.

Homogeneous AIOps tools run on a single platform and use, for example, AI resources that come from a single cloud provider such as Amazon AWS or Microsoft Azure. While such a tool can manage services such as storage, data, and compute, it can only do so on that one provider's platform. This can compromise effective operations management for those running hybrid or multi-cloud deployments.

Most AIOps tools are heterogeneous, which means that they can monitor and manage a variety of different cloud brands, as well as the native systems operated within those cloud providers. In addition, these AIOps tools can manage traditional on-premises systems and even mainframes, as well as IoT and edge-based computing environments.
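
One way to read the four categories is as two independent axes: whether a tool can act on what it finds (active or passive) and how many platforms it covers (homogeneous or heterogeneous). The sketch below encodes that reading. It is an interpretation of the text rather than an established taxonomy, and the `AIOpsTool` record, its fields, and the tool names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class AIOpsTool:
    name: str
    can_remediate: bool        # True if the tool can fix problems itself
    platforms: FrozenSet[str]  # platforms the tool can monitor and manage


def categorize(tool: AIOpsTool) -> Tuple[str, str]:
    """Place a tool on both axes: (active|passive, homogeneous|heterogeneous)."""
    action = "active" if tool.can_remediate else "passive"
    scope = "homogeneous" if len(tool.platforms) == 1 else "heterogeneous"
    return action, scope


# Hypothetical tools used only to exercise the function.
single_cloud_monitor = AIOpsTool("monitor-a", False, frozenset({"aws"}))
multi_cloud_healer = AIOpsTool(
    "healer-b", True, frozenset({"aws", "azure", "on_premises"})
)

print(categorize(single_cloud_monitor))
print(categorize(multi_cloud_healer))
```

Under this reading a single tool always has one label from each axis, for example passive and homogeneous, or active and heterogeneous.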


AIOps provides efficiency and automation opportunities that lower costs for businesses and free IT operations teams to invest time in more valuable activities elsewhere. As the field evolves, the tools will develop new capabilities and combine existing ones into core services.

Are you looking for AIOps strategies or solutions? Register for our free webinar on July 30th entitled “AI Ops: Revolutionizing IT Management with Artificial Intelligence”
