Suicide Risk Assessment Using Multimodal Speech Features: A Study on the SW1 Challenge Dataset
By: Ambre Marie, Ilias Maoudj, Guillaume Dardenne, and others
Potential Business Impact:
Helps doctors find teens at risk of suicide.
The 1st SpeechWellness Challenge addresses the need for speech-based suicide risk assessment in adolescents. This study investigates a multimodal approach to the challenge, integrating automatic transcription with WhisperX, linguistic embeddings from Chinese RoBERTa, and audio embeddings from WavLM. Additionally, handcrafted acoustic features -- including MFCCs, spectral contrast, and pitch-related statistics -- were incorporated. We explored three fusion strategies: early concatenation, modality-specific processing, and weighted attention with mixup regularization. Results show that weighted attention provided the best generalization, achieving 69% accuracy on the development set, though a performance gap between the development and test sets highlights generalization challenges. Our findings, strictly tied to the MINI-KID framework, emphasize the importance of refining embedding representations and fusion mechanisms to enhance classification reliability.
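The best-performing strategy, weighted attention over modalities with mixup regularization, can be sketched as below. This is a minimal illustration, not the authors' implementation: the module names, the shared hidden size, and the dimensions (768 for RoBERTa/WavLM embeddings, 40 for handcrafted acoustic features) are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedAttentionFusion(nn.Module):
    """Fuse per-modality embeddings via learned attention weights.

    Assumed dimensions: 768 for RoBERTa/WavLM embeddings,
    40 for handcrafted acoustic features (illustrative only).
    """
    def __init__(self, dims=(768, 768, 40), hidden=256, n_classes=2):
        super().__init__()
        # Project each modality into a shared hidden space
        self.proj = nn.ModuleList(nn.Linear(d, hidden) for d in dims)
        # One scalar attention score per projected modality
        self.score = nn.Linear(hidden, 1)
        self.clf = nn.Linear(hidden, n_classes)

    def forward(self, feats):
        # feats: list of (batch, dim_i) tensors, one per modality
        h = torch.stack(
            [torch.tanh(p(x)) for p, x in zip(self.proj, feats)], dim=1
        )                                        # (B, M, hidden)
        w = F.softmax(self.score(h), dim=1)      # (B, M, 1) modality weights
        fused = (w * h).sum(dim=1)               # attention-weighted sum
        return self.clf(fused)

def mixup(x_list, y, alpha=0.2):
    """Mixup applied with the same coefficient across all modalities."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(y.size(0))
    mixed = [lam * x + (1 - lam) * x[idx] for x in x_list]
    return mixed, y, y[idx], lam  # interpolate the loss with lam at train time
```

During training, the loss would be computed as `lam * ce(out, y_a) + (1 - lam) * ce(out, y_b)`; sharing one mixup coefficient across modalities keeps the fused inputs consistent.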
Similar Papers
In-context learning capabilities of Large Language Models to detect suicide risk among adolescents from speech transcripts
Audio and Speech Processing
Helps find teens at risk of suicide by listening.
Leveraging Large Language Models for Spontaneous Speech-Based Suicide Risk Detection
Sound
Listens to voices to find teens at risk.
Dynamic Fusion Multimodal Network for SpeechWellness Detection
Sound
Helps find kids at risk of suicide.