Score: 1

Are LLM Agents Behaviorally Coherent? Latent Profiles for Social Simulation

Published: September 3, 2025 | arXiv ID: 2509.03736v1

By: James Mooney, Josef Woldense, Zheng Robert Jia, and more

Potential Business Impact:

LLM agents cannot yet reliably stand in for human participants in research studies.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

The impressive capabilities of Large Language Models (LLMs) have fueled the notion that synthetic agents can serve as substitutes for real participants in human-subject research. In an effort to evaluate the merits of this claim, social science researchers have largely focused on whether LLM-generated survey data corresponds to that of a human counterpart whom the LLM is prompted to represent. In contrast, we address a more fundamental question: Do agents maintain internal consistency, retaining similar behaviors when examined under different experimental settings? To this end, we develop a study designed to (a) reveal the agent's internal state and (b) examine agent behavior in a basic dialogue setting. This design enables us to explore a set of behavioral hypotheses to assess whether an agent's conversation behavior is consistent with what we would expect from its revealed internal state. Our findings on these hypotheses show significant internal inconsistencies in LLMs across model families and at differing model sizes. Most importantly, we find that, although agents may generate responses matching those of their human counterparts, they fail to be internally consistent, representing a critical gap in their capabilities to accurately substitute for real participants in human-subject research. Our simulation code and data are publicly accessible.
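To make the two-step design concrete, here is a minimal Python sketch, not the authors' released code: `query_model`, the `PERSONA` string, and the keyword-based consistency check are all illustrative assumptions standing in for the paper's actual protocol and analysis.

```python
# Minimal sketch of the two-step probe described in the abstract:
# (a) elicit the agent's internal state directly, survey-style, then
# (b) observe its behavior in a basic dialogue and check agreement.
# `query_model` is a hypothetical stand-in for any chat-completion API.

PERSONA = "You are Alex, a 34-year-old teacher who strongly supports public transit."

def query_model(system_prompt: str, user_prompt: str) -> str:
    """Hypothetical LLM call; wire this to a real model provider."""
    raise NotImplementedError

def revealed_state(topic: str) -> str:
    # Step (a): ask for the stance directly.
    question = (f"On a scale of 1-5, how strongly do you support {topic}? "
                "Answer with the number only.")
    return query_model(PERSONA, question)

def dialogue_behavior(topic: str) -> str:
    # Step (b): elicit the stance indirectly, through conversation.
    scenario = (f"A neighbor argues against {topic}. "
                "Write your two-sentence reply.")
    return query_model(PERSONA, scenario)

def is_consistent(stated: str, reply: str) -> bool:
    # Toy check: a strong supporter (4 or 5) should not simply concede.
    # A real analysis would use human annotation or a judge model.
    supportive = stated.strip() in {"4", "5"}
    concedes = any(p in reply.lower()
                   for p in ("you're right", "i agree", "fair point"))
    return not (supportive and concedes)
```

In use, one would call `is_consistent(revealed_state(t), dialogue_behavior(t))` across many personas and topics; the paper's finding is that such checks frequently fail even when each response looks human-plausible in isolation.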

Country of Origin
🇺🇸 United States

Repos / Data Links

Page Count
25 pages

Category
Computer Science: Artificial Intelligence