EmoNews: A Spoken Dialogue System for Expressive News Conversations
By: Ryuki Matsuura, Shikhar Bharadwaj, Jiarui Liu and more
Potential Business Impact:
Makes talking computers sound more caring.
We develop a task-oriented spoken dialogue system (SDS) that regulates emotional speech based on contextual cues to enable more empathetic news conversations. Despite advancements in emotional text-to-speech (TTS) techniques, task-oriented emotional SDSs remain underexplored due to the compartmentalized nature of SDS and emotional TTS research, as well as the lack of standardized evaluation metrics for social goals. We address these challenges by developing an emotional SDS for news conversations that utilizes a large language model (LLM)-based sentiment analyzer to identify appropriate emotions and PromptTTS to synthesize context-appropriate emotional speech. We also propose a subjective evaluation scale for emotional SDSs and judge the emotion regulation performance of the proposed and baseline systems. Experiments showed that our emotional SDS outperformed a baseline system in terms of emotion regulation and engagement. These results suggest that speech emotion plays a critical role in making conversations more engaging. All our source code is open-sourced at https://github.com/dhatchi711/espnet-emotional-news/tree/emo-sds/egs2/emo_news_sds/sds1
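The two-stage flow described in the abstract (a sentiment analyzer picks an emotion, then a prompt-driven TTS model renders the reply in that style) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the three emotion labels, the style prompt wording, the keyword lists, and the `TTSRequest` type are hypothetical, and the keyword matcher only stands in for the LLM-based analyzer. It is not the authors' implementation and does not call PromptTTS.

```python
from dataclasses import dataclass

# Hypothetical emotion labels and style prompts (not taken from the paper).
STYLE_PROMPTS = {
    "positive": "Speak in a cheerful, upbeat tone.",
    "negative": "Speak in a gentle, sympathetic tone.",
    "neutral": "Speak in a calm, neutral tone.",
}

# Hypothetical cue words; a real system would query an LLM instead.
NEGATIVE_CUES = {"died", "crash", "flood", "injured", "loss"}
POSITIVE_CUES = {"wins", "won", "rescued", "celebrates", "record"}


def analyze_sentiment(text: str) -> str:
    """Keyword-based stand-in for the LLM-based sentiment analyzer."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    if words & NEGATIVE_CUES:
        return "negative"
    if words & POSITIVE_CUES:
        return "positive"
    return "neutral"


@dataclass
class TTSRequest:
    """Hypothetical input for a prompt-driven TTS model."""
    text: str          # what the system says
    style_prompt: str  # natural-language description of how to say it


def build_tts_request(response_text: str) -> TTSRequest:
    """Pick an emotion from the response text and attach a style prompt."""
    emotion = analyze_sentiment(response_text)
    return TTSRequest(text=response_text, style_prompt=STYLE_PROMPTS[emotion])
```

Keeping emotion selection separate from synthesis means the analyzer can be swapped (for example, for an LLM call) without touching the TTS side, which only sees a text and a style prompt.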
Similar Papers
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting
Audio and Speech Processing
Makes talking robots sound happy or sad.
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems
Computation and Language
Lets you easily test and compare talking computer programs.
EmoTale: An Enacted Speech-emotion Dataset in Danish
Computation and Language
Helps computers understand Danish emotions in speech.