Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection
By: Junjun Pan, Yixin Liu, Rui Miao, and more
Large language model (LLM)-based multi-agent systems (MAS) have shown strong capabilities in solving complex tasks. As MAS become increasingly autonomous in safety-critical tasks, detecting malicious agents has become a critical security concern. Although existing graph anomaly detection (GAD)-based defenses can identify anomalous agents, they mainly rely on coarse sentence-level information and overlook fine-grained lexical cues, leading to suboptimal performance. Moreover, the lack of interpretability in these methods limits their reliability and real-world applicability. To address these limitations, we propose XG-Guard, an explainable and fine-grained safeguarding framework for detecting malicious agents in MAS. To incorporate both coarse- and fine-grained textual information for anomalous agent identification, we use a bi-level agent encoder to jointly model the sentence- and token-level representations of each agent. A theme-based anomaly detector further captures the evolving discussion focus in MAS dialogues, while a bi-level score fusion mechanism quantifies token-level contributions for explanation. Extensive experiments across diverse MAS topologies and attack scenarios demonstrate that XG-Guard achieves robust detection performance and strong interpretability.
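To make the bi-level idea concrete, below is a minimal sketch of how sentence-level and token-level anomaly scores could be computed and fused, with per-token contributions kept as an explanation signal. This is not XG-Guard's actual implementation (the paper does not specify these details here); all function names, the distance-based scoring, the theme estimate, and the mixing weight alpha are illustrative assumptions.

```python
# Hypothetical sketch of bi-level anomaly scoring and score fusion.
# Assumptions: scores are distances from a centroid/"theme" vector, and
# fusion is a simple weighted mix; XG-Guard's actual components differ.
import numpy as np


def sentence_anomaly_scores(sentence_embs: np.ndarray) -> np.ndarray:
    """Coarse level: deviation of each agent's sentence embedding from the
    mean embedding of all agents (a stand-in for a GAD-style detector)."""
    center = sentence_embs.mean(axis=0)
    return np.linalg.norm(sentence_embs - center, axis=1)


def token_anomaly_scores(token_embs, theme: np.ndarray):
    """Fine level: per-token deviation from a 'theme' vector summarizing the
    discussion focus; the agent score is the mean deviation. Also returns the
    per-token deviations as token-level explanations."""
    agent_scores, contributions = [], []
    for tokens in token_embs:                  # tokens: (n_tokens, dim)
        dev = np.linalg.norm(tokens - theme, axis=1)
        agent_scores.append(dev.mean())
        contributions.append(dev)              # token-level contribution
    return np.array(agent_scores), contributions


def fuse_scores(sent_scores: np.ndarray, tok_scores: np.ndarray,
                alpha: float = 0.5) -> np.ndarray:
    """Bi-level fusion: min-max normalize each level, then mix with weight alpha."""
    def norm(x):
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)
    return alpha * norm(sent_scores) + (1 - alpha) * norm(tok_scores)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_agents, dim = 5, 16
    sent = rng.normal(size=(n_agents, dim))                       # one embedding per agent
    toks = [rng.normal(size=(rng.integers(8, 20), dim))           # token embeddings per agent
            for _ in range(n_agents)]
    theme = sent.mean(axis=0)                                     # crude theme estimate
    s_scores = sentence_anomaly_scores(sent)
    t_scores, token_contrib = token_anomaly_scores(toks, theme)
    fused = fuse_scores(s_scores, t_scores)
    print("most suspicious agent:", int(fused.argmax()))
```

In this toy setup, the per-token deviations returned by token_anomaly_scores play the role of the explanation: the tokens of the flagged agent with the largest deviations indicate which lexical cues drove the anomaly score.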
Similar Papers
Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation
Cryptography and Security
Detects corrupted communication in LLM agent teams via node evaluation.
Shadows in the Code: Exploring the Risks and Defenses of LLM-based Multi-Agent Software Development Systems
Cryptography and Security
Makes AI-built software safer by studying hidden attacks and their defenses.
Toward a Safe Internet of Agents
Multiagent Systems
Makes AI agents safer and more trustworthy.