Emotional Support Evaluation Framework via Controllable and Diverse Seeker Simulator
By: Chaewon Heo, Cheyon Jin, Yohan Jo
As emotional support chatbots have gained significant traction in both research and industry, a common evaluation strategy has emerged: using help-seeker simulators to interact with supporter chatbots. However, current simulators suffer from two critical limitations: (1) they fail to capture the behavioral diversity of real-world seekers, often portraying them as overly cooperative, and (2) they lack the controllability required to simulate specific seeker profiles. To address these challenges, we present a controllable seeker simulator driven by nine psychological and linguistic features that underpin seeker behavior. Using authentic Reddit conversations, we train our model with a Mixture-of-Experts (MoE) architecture, which separates diverse seeker behaviors into specialized parameter subspaces and thereby enhances fine-grained controllability. Our simulator achieves better profile adherence and behavioral diversity than existing approaches. Furthermore, evaluating seven prominent supporter models with our system uncovers performance degradations that were previously obscured. These findings underscore the utility of our framework in providing a more faithful, stress-tested evaluation of emotional support chatbots.
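To make the MoE idea concrete, here is a minimal toy sketch of gated expert routing. It is not the authors' implementation. The abstract does not give the feature names, the number of experts, or the routing rule, so `NUM_EXPERTS`, `TOP_K`, the linear gate, and the stand-in experts are all assumptions for illustration. Routing directly on a profile vector is also a simplification, since MoE language models typically route on hidden states.

```python
import math
import random

NUM_FEATURES = 9  # the abstract mentions nine profile features (names not given)
NUM_EXPERTS = 4   # assumption: expert count is not stated in the source
TOP_K = 2         # assumption: top-k routing is a common MoE choice


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(profile, gate_weights, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top-k experts.

    The gate is a plain linear map from the profile vector to one score
    per expert; the selected weights are renormalized to sum to 1.
    """
    scores = [sum(w * x for w, x in zip(row, profile)) for row in gate_weights]
    probs = softmax(scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_output(profile, gate_weights, experts):
    """Combine the selected experts' outputs using the gate weights."""
    return sum(w * experts[i](profile) for i, w in route(profile, gate_weights))


# Hypothetical setup: random gate weights and trivial scalar "experts".
rng = random.Random(0)
gate_weights = [
    [rng.uniform(-1.0, 1.0) for _ in range(NUM_FEATURES)]
    for _ in range(NUM_EXPERTS)
]
experts = [lambda p, k=k: (k + 1) * sum(p) for k in range(NUM_EXPERTS)]

# A made-up seeker profile with one value per feature.
profile = [0.2, 0.9, 0.1, 0.5, 0.0, 0.7, 0.3, 0.8, 0.4]
selected = route(profile, gate_weights)
combined = moe_output(profile, gate_weights, experts)
```

In this toy setup, different profile vectors activate different subsets of experts. That is the intuition behind separating seeker behaviors into specialized parameter subspaces, although the actual model may realize it quite differently.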
Similar Papers
COMPEER: Controllable Empathetic Reinforcement Reasoning for Emotional Support Conversation
Computation and Language
Helps computers give better emotional support.
EmoHarbor: Evaluating Personalized Emotional Support by Simulating the User's Internal World
Computation and Language
Helps computers give better, personal comfort.
Reinforcing Trustworthiness in Multimodal Emotional Support Systems
Computers and Society
Helps computers give better emotional support.