XAgen: An Explainability Tool for Identifying and Correcting Failures in Multi-Agent Workflows
By: Xinru Wang, Ming Yin, Eunyee Koh, and more
As multi-agent systems powered by Large Language Models (LLMs) are increasingly adopted in real-world workflows, users with diverse technical backgrounds are now building and refining their own agentic processes. However, these systems can fail in opaque ways, making it difficult for users to observe, understand, and correct errors. We conducted formative interviews with 12 practitioners to identify mismatches between existing observability tools and users' needs. Based on these insights, we designed XAgen, an explainability tool that supports users with varying AI expertise through three core capabilities: log visualization for glanceable workflow understanding, human-in-the-loop feedback to capture expert judgment, and automatic error detection via an LLM-as-a-judge. In a user study with 8 participants, XAgen helped users more easily locate failures, attribute them to specific agents or steps, and iteratively improve configurations. Our findings surface human-centered design guidelines for explainable agentic AI development and highlight opportunities for more context-aware interactive debugging.
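The abstract names automatic error detection via an LLM-as-a-judge as one of XAgen's core capabilities. The paper itself does not include code, but the minimal Python sketch below illustrates the general idea under assumed names (WorkflowStep, detect_errors, and judge_fn are hypothetical, not XAgen's implementation): a judge model reviews each logged step of a multi-agent workflow and flags ones that look like failures, which can then be attributed to a specific agent or step.

```python
# Minimal, hypothetical sketch of LLM-as-a-judge error detection over
# multi-agent workflow logs. WorkflowStep, detect_errors, and judge_fn are
# illustrative names, not XAgen's actual API.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class WorkflowStep:
    agent: str    # which agent produced this step
    action: str   # e.g. "tool_call" or "llm_response"
    content: str  # the step's logged output


def detect_errors(
    steps: List[WorkflowStep],
    judge_fn: Callable[[str], str],  # any function that sends a prompt to an LLM and returns its reply
) -> List[dict]:
    """Ask a judge model to flag steps that look like failures."""
    findings = []
    for i, step in enumerate(steps):
        prompt = (
            "You are reviewing one step of a multi-agent workflow.\n"
            f"Agent: {step.agent}\nAction: {step.action}\nOutput: {step.content}\n"
            "Reply with 'OK' if the step looks correct; otherwise, briefly describe the error."
        )
        verdict = judge_fn(prompt).strip()
        if verdict.upper() != "OK":
            findings.append({"step": i, "agent": step.agent, "issue": verdict})
    return findings


# Example wiring with the OpenAI client (assumes OPENAI_API_KEY is set):
# from openai import OpenAI
# client = OpenAI()
# judge = lambda p: client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": p}],
# ).choices[0].message.content
# issues = detect_errors(logged_steps, judge)
```

Passing the judge in as a plain callable keeps the sketch independent of any particular LLM provider; the paper's actual detection pipeline may differ.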
Similar Papers
AgenTracer: Who Is Inducing Failure in the LLM Agentic Systems?
Computation and Language
Fixes AI mistakes in complex robot teams.
XAgents: A Unified Framework for Multi-Agent Cooperation via IF-THEN Rules and Multipolar Task Processing Graph
Artificial Intelligence
Helps AI teams solve tricky problems better.