Semantic Differentiation in Speech Emotion Recognition: Insights from Descriptive and Expressive Speech Roles
By: Rongchen Guo, Vincent Francoeur, Isar Nejadgholi, and more
Potential Business Impact:
Helps computers understand your feelings in speech.
Speech Emotion Recognition (SER) is essential for improving human-computer interaction, yet its accuracy remains constrained by the complexity of emotional nuances in speech. In this study, we distinguish between descriptive semantics, which represents the contextual content of speech, and expressive semantics, which reflects the speaker's emotional state. After participants watched emotionally charged movie segments, we recorded audio clips of them describing their experiences, along with the intended emotion tags for each clip, participants' self-rated emotional responses, and their valence/arousal scores. Through experiments, we show that descriptive semantics align with intended emotions, while expressive semantics correlate with evoked emotions. Our findings inform SER applications in human-AI interaction and pave the way for more context-aware AI systems.
Similar Papers
Speech Emotion Recognition with Phonation Excitation Information and Articulatory Kinematics
Sound
Reads emotions from how someone talks and moves their mouth.
Beyond saliency: enhancing explanation of speech emotion recognition with expert-referenced acoustic cues
Machine Learning (CS)
Shows why computers understand emotions in voices.
Amplifying Emotional Signals: Data-Efficient Deep Learning for Robust Speech Emotion Recognition
Audio and Speech Processing
Helps computers understand your feelings from your voice.