The Social Cost of Intelligence: Emergence, Propagation, and Amplification of Stereotypical Bias in Multi-Agent Systems
By: Thi-Nhung Nguyen, Linhao Luo, Thuy-Trang Vu, and more
Potential Business Impact:
Helps teams that build collaborating AI agents detect and limit unfair bias.
Bias in large language models (LLMs) remains a persistent challenge, manifesting in stereotyping and unfair treatment across social groups. While prior research has primarily focused on individual models, the rise of multi-agent systems (MAS), where multiple LLMs collaborate and communicate, introduces new and largely unexplored dynamics in bias emergence and propagation. In this work, we present a comprehensive study of stereotypical bias in MAS, examining how internal specialization, underlying LLMs, and inter-agent communication protocols influence bias robustness, propagation, and amplification. We simulate social contexts where agents represent different social groups and evaluate system behavior under various interaction and adversarial scenarios. Experiments on three bias benchmarks reveal that MAS are generally less robust than single-agent systems, with bias often emerging early through in-group favoritism. However, cooperative and debate-based communication can mitigate bias amplification, while more robust underlying LLMs improve overall system stability. Our findings highlight critical factors shaping fairness and resilience in multi-agent LLM systems.
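To make the debate-based communication protocol concrete, below is a minimal Python sketch of one plausible setup: each agent gets a persona system prompt (here representing a social group), answers a question independently, revises after reading the other agents' answers, and the system resolves the final answer by majority vote. The `LLMFn` type, the `debate_round` helper, and the persona strings are illustrative assumptions, not the paper's actual implementation.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical LLM call: takes (system_prompt, user_prompt), returns text.
# Swap in any real client wrapper (OpenAI, local model, etc.).
LLMFn = Callable[[str, str], str]

def debate_round(personas: List[str], question: str, llm: LLMFn,
                 rounds: int = 2) -> str:
    """Debate-style protocol sketch: each persona answers, then revises
    after seeing the other agents' latest answers; finish with a
    majority vote over the final answers."""
    answers = [llm(persona, question) for persona in personas]
    for _ in range(rounds):
        revised = []
        for i, persona in enumerate(personas):
            others = "\n".join(
                f"Agent {j}: {a}" for j, a in enumerate(answers) if j != i
            )
            prompt = (
                f"Question: {question}\n"
                f"Other agents answered:\n{others}\n"
                "Reconsider and give your final answer."
            )
            revised.append(llm(persona, prompt))
        answers = revised
    # Majority vote; ties resolve by insertion order (Python 3.7+ Counter).
    return Counter(answers).most_common(1)[0][0]

# Example personas representing different social groups, mirroring the
# paper's simulated social contexts (illustrative wording only).
personas = [
    "You are an agent representing social group A. Answer fairly.",
    "You are an agent representing social group B. Answer fairly.",
    "You are a neutral moderator agent. Answer fairly.",
]
```

A cooperative protocol would differ mainly in the revision prompt (asking agents to build consensus rather than argue), and an adversarial scenario could be simulated by giving one persona a deliberately biased system prompt.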
Similar Papers
Your AI Bosses Are Still Prejudiced: The Emergence of Stereotypes in LLM-Based Multi-Agent Systems
Computation and Language
AI agents learn unfair ideas from talking to each other.
From Single to Societal: Analyzing Persona-Induced Bias in Multi-Agent Interactions
Multiagent Systems
AI agents show unfair bias based on the personas they are assigned.
An Empirical Study of Group Conformity in Multi-Agent Systems
Artificial Intelligence
AI agents in debates conform to group opinions, much like people do.