In recent months, generative AI has flooded the observability market with promises. Tools claim to explain logs, generate dashboards automatically, or find root causes through natural-language queries. It sounds great, but what actually delivers value in production? And what is still more marketing than applicable technology?
In this article, we set the hype aside and analyze what already works in practice and what is still stuck in scripted demos.
Where Generative AI Truly Delivers Value
Generative AI, especially LLMs such as GPT, Claude, or Gemini, can be an important ally in accelerating technical reasoning. Its strength does not lie in making decisions or resolving incidents on its own; rather, it excels at organizing scattered information and suggesting paths for investigation.
Explanatory Analysis of Logs and Traces
This is perhaps the most concrete application so far. LLMs are good at summarizing large volumes of verbose logs and highlighting patterns and anomalous behavior. They are especially useful when:
- The environment is distributed, with multiple services and languages;
- The logs are poorly documented or generated by a mix of frameworks;
- Response time is critical, as during a production incident.
AI can highlight patterns such as “recurring timeouts in the authentication service” or “parsing failures on the /checkout endpoint.” With sufficient context — such as event origin, stack trace, and history — the model helps reduce the time to the first insight.
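The "recurring pattern" part of this workflow does not even require a model: a deterministic pre-processing pass can collapse similar log lines before anything is sent to an LLM. The sketch below is illustrative only (the masking rules and log lines are invented for the example), not any vendor's implementation.

```python
import re
from collections import Counter

def summarize_log_patterns(lines, top_n=3):
    """Group log lines into recurring patterns by masking variable parts.

    A minimal sketch of the pattern extraction an LLM-based summarizer
    might build on; the masking regex is deliberately simplistic.
    """
    counts = Counter()
    for line in lines:
        # Mask hex IDs and numbers so similar events collapse into one pattern.
        pattern = re.sub(r"0x[0-9a-f]+|\d+", "<N>", line.lower())
        counts[pattern] += 1
    return counts.most_common(top_n)

# Hypothetical log lines for illustration.
logs = [
    "Timeout after 5000 ms calling auth-service",
    "Timeout after 5003 ms calling auth-service",
    "Parse error at offset 42 on /checkout",
    "Timeout after 4998 ms calling auth-service",
]
for pattern, count in summarize_log_patterns(logs):
    print(f"{count}x  {pattern}")
# → 3x  timeout after <N> ms calling auth-service
# → 1x  parse error at offset <N> on /checkout
```

Feeding an LLM these compact pattern counts, rather than raw logs, is one way to keep the context window small while still surfacing "recurring timeouts in the authentication service"-style insights.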
Assisted Generation of Dashboards and Queries
Platforms like Datadog, New Relic, and Dynatrace are already testing query copilots. The idea is simple: you type something like “I want to see the average latency by region in the last 30 minutes,” and the system suggests a working chart.
For engineers who haven't mastered PromQL, NRQL, or SQL, this lowers the learning curve. But the output still requires review: AI can misname metrics, apply incorrect filters, or suggest less useful visualizations. Treat it as a starting point, not the final step.
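To make the review step concrete, here is a sketch of what such a copilot might emit for the latency request above, rendered as a PromQL-style string. The metric name is a plausible Prometheus histogram convention, not a real metric in your environment; mapping the user's phrasing onto metrics that actually exist is exactly where human review is needed.

```python
def suggest_latency_query(metric, group_by, window):
    """Render a PromQL-style average-latency query from structured intent.

    The metric name passed in below is hypothetical; a real copilot must
    resolve it against the metrics the environment actually exposes.
    """
    return (
        f"avg by ({group_by}) ("
        f"rate({metric}_sum[{window}]) / rate({metric}_count[{window}])"
        ")"
    )

# "I want to see the average latency by region in the last 30 minutes"
query = suggest_latency_query("http_request_duration_seconds", "region", "30m")
print(query)
```

Even a correct-looking query like this deserves a check: is latency recorded as a histogram, is `region` really a label on it, and is a 30-minute `rate` window appropriate for the scrape interval?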
Event Correlation
Another promising use is suggesting correlations. AI analyzes multiple signals (CPU, memory, latency, logs) and proposes temporal relationships between symptoms. For example:
“Redis began swapping at 3:32 PM, coinciding with the latency increase in the payment service.”
This approach doesn’t replace a root cause analysis but helps put the puzzle together more quickly.
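At its simplest, this kind of temporal correlation is just checking whether anomalies in two signals fall within the same time window. The sketch below is deliberately naive (thresholds, window, and data are invented for illustration); real correlation engines also weigh causality direction, seasonality, and service topology.

```python
def anomaly_times(series, threshold):
    """Return the epoch-second timestamps where a metric exceeds its threshold."""
    return {ts for ts, value in series if value > threshold}

def coincident(series_a, thr_a, series_b, thr_b, window=60):
    """Check whether anomalies in two signals overlap within `window` seconds."""
    times_a = anomaly_times(series_a, thr_a)
    times_b = anomaly_times(series_b, thr_b)
    return any(abs(a - b) <= window for a in times_a for b in times_b)

# Illustrative data: Redis swap activity (pages/s) and payment-service latency (ms).
swap = [(1000, 0), (1060, 0), (1120, 850), (1180, 900)]
latency = [(1000, 120), (1060, 130), (1150, 940), (1210, 960)]
print(coincident(swap, 100, latency, 500))  # → True: the spikes land within a minute
```

A `True` here is a hypothesis to investigate, not a verdict: coincidence in time is the starting point of root cause analysis, not its conclusion.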
What Is Still Buzzword (or Almost)
Much of what is sold as “generative AI” still relies on highly controlled contexts, predefined scripts, and manual curation. Caution is advised.
Infallible and Autonomous Diagnosis
There is still no model capable of delivering consistent diagnoses without deep knowledge of the environment. Variables such as inconsistent naming, noisy logs, or undocumented topologies directly affect AI performance. In real environments, interpretation errors are frequent.
Automated Incident Resolution
Some promises sound like fiction: “AI detects and fixes automatically.” In truth, autonomous action is only possible with pre-built automations in very well-mapped scenarios. AI can suggest commands, yes, but it rarely executes them safely, and it is almost never granted permission to do so.
Replacement of Technical Teams
Despite the talk, generative AI does not replace observability engineers, SREs, or infrastructure analysts. Its value lies in enhancing the team’s analytical capacity, not removing them from the process. Human judgment remains irreplaceable.
Pragmatic Paths to Adopt Generative AI in Observability
Below are some recommendations for incorporating generative AI without falling into traps:
- Start with the copilot, not the pilot: use AI to suggest, review, and organize — not to automate critical decisions right from the start.
- Provide context: model performance improves when there is structured data, incident history, and documented routines.
- Prioritize support and documentation tasks: incident summaries, report generation, and log explanations are good starting points.
- Always evaluate in a real environment: many features work well in demos but don’t hold up under production noise.
At dbsnoop, our focus is on separating hype from technical utility. We closely follow the use of generative AI in observability with a critical eye and based on practical testing. Keep following the blog for more technical analyses, with no shortcuts or empty promises.
Visit our YouTube channel to learn about the platform and watch tutorials.
Schedule a demo here.
Learn more about Flightdeck!
Learn about database monitoring with advanced tools here.