Score: 1

Exploring Machine Learning and Language Models for Multimodal Depression Detection

Published: August 28, 2025 | arXiv ID: 2508.20805v1

By: Javier Si Zhao Hong, Timothy Zoe Delaya, Sherwyn Chan Yin Kit and more

Potential Business Impact:

Finds sadness in voices, faces, and words.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

This paper presents our approach to the first Multimodal Personality-Aware Depression Detection Challenge, focusing on multimodal depression detection using machine learning and deep learning models. We explore and compare the performance of XGBoost, transformer-based architectures, and large language models (LLMs) on audio, video, and text features. Our results highlight the strengths and limitations of each type of model in capturing depression-related signals across modalities, offering insights into effective multimodal representation strategies for mental health prediction.

Country of Origin
🇸🇬 Singapore


Page Count
6 pages

Category
Computer Science:
Computation and Language