The Multi-Agent Fault Localization System Based on Monte Carlo Tree Search Approach

Published: July 30, 2025 | arXiv ID: 2507.22800v1

By: Rui Ren

Potential Business Impact:

Finds computer problems faster and more accurately.

Business Areas:

Simulation Software

In real-world deployments, the highly decoupled and flexible nature of microservices poses greater challenges to system reliability. Incidents occur more frequently, which has created demand for Root Cause Analysis (RCA) methods that enable rapid incident identification and recovery. Large language models (LLMs) offer a new path for quickly locating and recovering from incidents by combining their strong generalization ability with expert experience. Current LLM-for-RCA frameworks are based on ideas such as ReAct and Chain-of-Thought, but LLM hallucination and the way anomalies propagate between services often lead to incorrect localization results. Moreover, the massive amount of anomalous information generated in large, complex systems strains the context window of LLMs. To address these challenges, we propose KnowledgeMind, an LLM multi-agent system based on Monte Carlo Tree Search and a knowledge base reward mechanism for standardized service-by-service reasoning. Compared to state-of-the-art (SOTA) LLM-for-RCA methods, our service-by-service exploration approach significantly reduces the burden on the maximum context window, requiring only one-tenth of the context length. Additionally, by incorporating a rule-based real-time reward mechanism, our method mitigates hallucinations during inference. Compared to the SOTA LLM-for-RCA framework, our method achieves a 49.29% to 128.35% improvement in root cause localization accuracy.
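The abstract does not spell out KnowledgeMind's agents, knowledge base, or reward rules, so the following is only a minimal sketch of the general idea: Monte Carlo Tree Search over a service dependency graph, scored by a rule-based reward instead of an LLM judgment. The call graph, the anomaly flags, the `rule_reward` rule, and all function names are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math
import random

# Hypothetical call graph: service -> services it depends on (acyclic).
DEPENDS_ON = {
    "frontend": ["cart", "checkout"],
    "cart": ["redis"],
    "checkout": ["payment", "shipping"],
    "payment": ["db"],
    "shipping": [],
    "redis": [],
    "db": [],
}
# Hypothetical set of services that currently show anomalies.
ANOMALOUS = {"frontend", "checkout", "payment"}


def rule_reward(service):
    """Toy stand-in for a knowledge-base rule (an assumption, not the paper's)."""
    if service not in ANOMALOUS:
        return 0.0
    if any(dep in ANOMALOUS for dep in DEPENDS_ON[service]):
        return 0.5  # anomaly is probably propagated from a dependency
    return 1.0      # anomalous with healthy dependencies: likely root cause


class Node:
    def __init__(self, service, parent=None):
        self.service = service
        self.parent = parent
        self.children = []
        self.untried = list(DEPENDS_ON[service])
        self.visits = 0
        self.total = 0.0


def ucb1(child, parent_visits, c=1.4):
    """Mean reward plus an exploration bonus for rarely visited children."""
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return child.total / child.visits + bonus


def path_reward(node, rng):
    """Best rule score on the path: root -> node -> random walk to a leaf service."""
    best = 0.0
    cur = node
    while cur is not None:
        best = max(best, rule_reward(cur.service))
        cur = cur.parent
    service = node.service
    while DEPENDS_ON[service]:
        service = rng.choice(DEPENDS_ON[service])
        best = max(best, rule_reward(service))
    return best


def localize(entry, iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(entry)
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded nodes by UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # Expansion: inspect one not-yet-visited dependency.
        if node.untried:
            child = Node(node.untried.pop(), parent=node)
            node.children.append(child)
            node = child
        # Simulation, then backpropagation of the rule-based reward.
        reward = path_reward(node, rng)
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    # Follow the most-visited branch; report its highest-scoring service.
    best, node = root.service, root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        if rule_reward(node.service) > rule_reward(best):
            best = node.service
    return best


print(localize("frontend"))
```

Each step of the search looks at a single service and its direct dependencies, which is the property the abstract credits for the smaller context-window footprint; in a real system the rule check would presumably be replaced by agent reasoning over that service's telemetry.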

TAMO: Fine-Grained Root Cause Analysis via Tool-Assisted LLM Agent with Multi-Modality Observation Data in Cloud-Native Systems

Artificial Intelligence

Fixes computer problems automatically by understanding clues.

29 Apr 2025

90%

MicroRCA-Agent: Microservice Root Cause Analysis Method Based on Large Language Model Agents

Artificial Intelligence

Finds computer problems faster by reading logs.

19 Sep 2025

90%

Reasoning Language Models for Root Cause Analysis in 5G Wireless Networks

Artificial Intelligence

Fixes phone network problems faster using smart AI.

29 Jul 2025

Page Count

12 pages
