Exploring Machine Learning and Language Models for Multimodal Depression Detection
By: Javier Si Zhao Hong, Timothy Zoe Delaya, Sherwyn Chan Yin Kit, and more
Potential Business Impact:
Finds sadness in voices, faces, and words.
This paper presents our approach to the first Multimodal Personality-aware Depression Detection (MPDD) Challenge, focusing on multimodal depression detection using machine learning and deep learning models. We explore and compare the performance of XGBoost, transformer-based architectures, and large language models (LLMs) on audio, video, and text features. Our results highlight the strengths and limitations of each type of model in capturing depression-related signals across modalities, offering insights into effective multimodal representation strategies for mental health prediction.
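To make the XGBoost-on-multimodal-features setup concrete, below is a minimal early-fusion sketch. It uses synthetic stand-in arrays for the audio, video, and text features (the actual MPDD challenge feature extractors, dimensions, and labels are not specified here), concatenates them per sample, and trains an XGBoost classifier; it is an illustration of the general strategy, not the authors' exact pipeline.

```python
# Minimal early-fusion sketch with synthetic data (hypothetical shapes;
# the real MPDD features and labels are not reproduced here).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n_samples = 200

# Stand-in feature matrices: one row per participant/session.
audio_feats = rng.normal(size=(n_samples, 128))   # e.g. speech embeddings
video_feats = rng.normal(size=(n_samples, 256))   # e.g. facial-expression features
text_feats  = rng.normal(size=(n_samples, 384))   # e.g. sentence embeddings
labels      = rng.integers(0, 2, size=n_samples)  # 0 = non-depressed, 1 = depressed

# Early fusion: concatenate all modalities into one feature vector per sample.
X = np.concatenate([audio_feats, video_feats, text_feats], axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, stratify=labels, random_state=0
)

clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1,
                    eval_metric="logloss")
clf.fit(X_train, y_train)

print("F1:", f1_score(y_test, clf.predict(X_test)))
```

Transformer- or LLM-based variants would replace the hand-crafted feature matrices with learned modality encoders, but the fusion-then-classify structure stays the same.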
Similar Papers
It Hears, It Sees too: Multi-Modal LLM for Depression Detection By Integrating Visual Understanding into Audio Language Models
Multimedia
Helps computers detect sadness from voices and faces.
Personality-Enhanced Multimodal Depression Detection in the Elderly
Sound
Helps doctors find depression in elderly people.
The First MPDD Challenge: Multimodal Personality-aware Depression Detection
Artificial Intelligence
Helps doctors find depression in people of all ages.