PublicAgent: Multi-Agent Design Principles From an LLM-Based Open Data Analysis Framework
By: Sina Montazeri, Yunhe Feng, Kewei Sha
Potential Business Impact:
Lets anyone ask questions of data without being a computer expert.
Open data repositories hold potential for evidence-based decision-making, yet remain inaccessible to non-experts who lack expertise in dataset discovery, schema mapping, and statistical analysis. Large language models show promise for individual tasks, but end-to-end analytical workflows expose fundamental limitations: attention dilutes across growing contexts, specialized reasoning patterns interfere with one another, and errors propagate undetected. We present PublicAgent, a multi-agent framework that addresses these limitations by decomposing the workflow into specialized agents for intent clarification, dataset discovery, analysis, and reporting. This architecture keeps attention focused within each agent's context and enables validation at each stage. From an evaluation across five models and 50 queries, we derive five design principles for multi-agent LLM systems. First, specialization provides value independent of model strength: even the strongest model shows 97.5% agent win rates, with benefits orthogonal to model scale. Second, agents divide into universal (discovery, analysis) and conditional (report, intent) categories. Universal agents show consistent effectiveness (standard deviation 12.4%), while the effectiveness of conditional agents varies by model (standard deviation 20.5%). Third, agents mitigate distinct failure modes: removing discovery or analysis causes catastrophic failures (243-280 instances), while removing report or intent causes quality degradation. Fourth, architectural benefits persist across task complexity with stable win rates (86-92% for analysis, 84-94% for discovery), indicating that the value comes from workflow management rather than reasoning enhancement. Fifth, wide variance in agent effectiveness across models (42-96% for analysis) requires model-aware architecture design. These principles guide when and why specialization is necessary for complex analytical workflows, while enabling broader access to public data through natural language interfaces.
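The staged decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Agent`, `run_pipeline`, and `AGENTS`, the stub agents, the in-memory catalog, and the sample readings are all hypothetical, and the stubs stand in for what would be LLM calls in a real system. The sketch only shows the general pattern of running specialized agents in sequence, each seeing the shared state and each validated before the next stage begins.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]


@dataclass
class Agent:
    name: str
    run: Callable[[State], State]      # reads shared state, returns new keys
    validate: Callable[[State], bool]  # stage-level check on this agent's output


def run_pipeline(agents: List[Agent], query: str) -> Tuple[State, List[str]]:
    """Run agents in order, stopping at the first stage that fails validation."""
    state: State = {"query": query}
    completed: List[str] = []
    for agent in agents:
        output = agent.run(state)
        if not agent.validate(output):
            # Failing here keeps an error from propagating to later stages.
            raise ValueError(f"stage '{agent.name}' failed validation")
        state.update(output)
        completed.append(agent.name)
    return state, completed


# Hypothetical stub agents; a real system would call a language model here.
def clarify_intent(state: State) -> State:
    return {"intent": str(state["query"]).strip().lower()}


def discover_dataset(state: State) -> State:
    catalog = {"air quality": "city_air_quality.csv"}  # assumed toy catalog
    matches = [name for topic, name in catalog.items() if topic in state["intent"]]
    return {"dataset": matches[0] if matches else None}


def analyze(state: State) -> State:
    readings = [10.0, 20.0, 30.0]  # placeholder for rows loaded from the dataset
    return {"result": {"mean": sum(readings) / len(readings)}}


def report(state: State) -> State:
    mean = state["result"]["mean"]
    return {"report": f"Mean value from {state['dataset']}: {mean:.1f}"}


AGENTS = [
    Agent("intent", clarify_intent, lambda out: bool(out["intent"])),
    Agent("discovery", discover_dataset, lambda out: out["dataset"] is not None),
    Agent("analysis", analyze, lambda out: "mean" in out["result"]),
    Agent("report", report, lambda out: bool(out["report"])),
]

if __name__ == "__main__":
    final_state, stages = run_pipeline(AGENTS, "Average air quality readings?")
    print(stages)
    print(final_state["report"])
```

In this sketch a query that matches no catalog entry stops at the discovery stage with an explicit error rather than passing a missing dataset on to analysis, which is one simple way to realize per-stage validation.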