The Personality Illusion: Revealing Dissociation Between Self-Reports & Behavior in LLMs
By: Pengrui Han, Rafal Kocielnik, Peiyang Song, et al.
Potential Business Impact:
LLMs can describe themselves like people, but their behavior doesn't always match.
Personality traits have long been studied as predictors of human behavior. Recent advances in Large Language Models (LLMs) suggest similar patterns may emerge in artificial systems, with advanced LLMs displaying consistent behavioral tendencies resembling human traits like agreeableness and self-regulation. Understanding these patterns is crucial, yet prior work primarily relied on simplified self-reports and heuristic prompting, with little behavioral validation. In this study, we systematically characterize LLM personality across three dimensions: (1) the dynamic emergence and evolution of trait profiles throughout training stages; (2) the predictive validity of self-reported traits in behavioral tasks; and (3) the impact of targeted interventions, such as persona injection, on both self-reports and behavior. Our findings reveal that instructional alignment (e.g., RLHF, instruction tuning) significantly stabilizes trait expression and strengthens trait correlations in ways that mirror human data. However, these self-reported traits do not reliably predict behavior, and observed associations often diverge from human patterns. While persona injection successfully steers self-reports in the intended direction, it exerts little or inconsistent effect on actual behavior. By distinguishing surface-level trait expression from behavioral consistency, our findings challenge assumptions about LLM personality and underscore the need for deeper evaluation in alignment and interpretability.
Similar Papers
Can LLMs Generate Behaviors for Embodied Virtual Agents Based on Personality Traits?
Human-Computer Interaction
Makes computer characters act like real people.
Evaluating LLM Alignment on Personality Inference from Real-World Interview Data
Computation and Language
Computers can't guess your personality from talking.