Large Language Models for Depression Recognition in Spoken Language Integrating Psychological Knowledge

Published: May 28, 2025 | arXiv ID: 2505.22863v1

By: Yupei Li, Shuaijie Shao, Manuel Milling and more

Potential Business Impact:

Helps computers detect sadness from voice and words.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Depression is a growing concern that is gaining attention in both public discourse and AI research. Deep neural networks (DNNs) have been used for depression recognition, but they still lack real-world effectiveness. Large language models (LLMs) show strong potential, yet they require domain-specific fine-tuning and struggle with non-textual cues. Because depression is often expressed through vocal tone and behaviour rather than explicit text, relying on language alone is insufficient. Diagnostic accuracy also suffers when psychological expertise is not incorporated. To address these limitations, we present, to the best of our knowledge, the first application of LLMs to multimodal depression detection using the DAIC-WOZ dataset. We extract audio features using the pre-trained model Wav2Vec and map them to text-based LLMs for further processing. We also propose a novel strategy for incorporating psychological knowledge into LLMs to enhance diagnostic performance, specifically using a question-and-answer set to grant authorised knowledge to the LLMs. Our approach yields a notable improvement in both Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) compared with the baseline score reported in the related original paper. The code is available at https://github.com/myxp-lyp/Depression-detection.git
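For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how MAE and RMSE are computed for a regression-style severity prediction. It assumes each participant has one numeric target score and one predicted score; the function names `mae` and `rmse` and the sample values are hypothetical illustrations and are not taken from the paper or its repository.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: the average of |prediction - target|."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: the square root of the mean squared error."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)


# Hypothetical severity scores for four participants (illustration only,
# not data from the paper).
targets = [4, 10, 16, 7]
predictions = [6, 9, 12, 7]

mae_value = mae(targets, predictions)
rmse_value = rmse(targets, predictions)
print(f"MAE:  {mae_value:.3f}")
print(f"RMSE: {rmse_value:.3f}")
```

RMSE is never smaller than MAE on the same predictions, and it grows faster when a few predictions are far off, so reporting both gives a sense of whether errors are spread evenly or concentrated in a few participants.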

It Hears, It Sees too: Multi-Modal LLM for Depression Detection By Integrating Visual Understanding into Audio Language Models

Multimedia

Helps computers spot depression from talking and faces.

25 Nov 2025

93%

Leveraging Large Language Models for Cost-Effective, Multilingual Depression Detection and Severity Assessment

Computation and Language

Helps find depression from what people write.

7 Apr 2025

Country of Origin

🇬🇧 United Kingdom

Page Count

13 pages
