A Unified Spoken Language Model with Injected Emotional-Attribution Thinking for Human-like Interaction
By: Qing Wang, Zehan Li, Yaodong Song, and more
Potential Business Impact:
Helps computers understand and respond to feelings.
This paper presents a unified spoken language model for emotional intelligence, enhanced by a novel data construction strategy termed Injected Emotional-Attribution Thinking (IEAT). IEAT incorporates user emotional states and their underlying causes into the model's internal reasoning process, enabling emotion-aware reasoning to be internalized rather than treated as explicit supervision. The model is trained with a two-stage progressive strategy. The first stage performs speech-text alignment and emotional attribute modeling via self-distillation, while the second stage conducts end-to-end cross-modal joint optimization to ensure consistency between textual and spoken emotional expressions. Experiments on the Human-like Spoken Dialogue Systems Challenge (HumDial) Emotional Intelligence benchmark demonstrate that the proposed approach achieves top-ranked performance across emotional trajectory modeling, emotional reasoning, and empathetic response generation under both LLM-based and human evaluations.
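The IEAT data construction step can be pictured as rewriting each dialogue turn so that the user's emotional state and its cause appear inside a hidden reasoning trace rather than as a separate supervision label. The sketch below is a hypothetical illustration under assumed names: the `build_ieat_example` function, the `<think>` tag, and the template wording are assumptions for clarity, not the paper's actual implementation.

```python
# Hypothetical sketch of IEAT-style data construction. All names
# (build_ieat_example, the <think> tag, the template phrasing) are
# illustrative assumptions, not the paper's released code.

def build_ieat_example(user_utterance, emotion, cause, response):
    """Inject the user's emotion and its attributed cause into a hidden
    reasoning segment, so emotion-aware reasoning is internalized during
    training instead of being treated as an explicit classification target."""
    thinking = (
        f"The user sounds {emotion} because {cause}. "
        f"I should acknowledge this feeling before answering."
    )
    # The injected reasoning is wrapped as a hidden "thinking" span,
    # followed by the empathetic response the model learns to produce.
    return {
        "input": user_utterance,
        "target": f"<think>{thinking}</think>{response}",
    }

example = build_ieat_example(
    "I failed my driving test again.",
    emotion="frustrated",
    cause="they failed the test for a second time",
    response="That's really tough. Trying again already takes courage.",
)
print(example["target"])
```

On this view, the second training stage would then optimize the full input-to-target mapping end to end across text and speech, keeping the emotional content of the hidden reasoning consistent with the spoken response.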
Similar Papers
From Emotion Classification to Emotional Reasoning: Enhancing Emotional Intelligence in Large Language Models
Computation and Language
Teaches AI to understand feelings better.
From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation
Computation and Language
Makes AI sound happier and more personal.
EICAP: Deep Dive in Assessment and Enhancement of Large Language Models in Emotional Intelligence through Multi-Turn Conversations
Computation and Language
Teaches computers to understand and respond to feelings.