Leveraging Evidence-Guided LLMs to Enhance Trustworthy Depression Diagnosis

Published: November 22, 2025 | arXiv ID: 2511.17947v1

By: Yining Yuan, J. Ben Tamo, Micky C. Nnamdi and more

Potential Business Impact:

Helps AI doctors diagnose illnesses more accurately.

Business Areas:

Semantic Search, Internet Services

Large language models (LLMs) show promise in automating clinical diagnosis, yet their non-transparent decision-making and limited alignment with diagnostic standards hinder trust and clinical adoption. We address this challenge by proposing a two-stage diagnostic framework that enhances transparency, trustworthiness, and reliability. First, we introduce Evidence-Guided Diagnostic Reasoning (EGDR), which guides LLMs to generate structured diagnostic hypotheses by interleaving evidence extraction with logical reasoning grounded in DSM-5 criteria. Second, we propose a Diagnosis Confidence Scoring (DCS) module that evaluates the factual accuracy and logical consistency of generated diagnoses through two interpretable metrics: the Knowledge Attribution Score (KAS) and the Logic Consistency Score (LCS). Evaluated on the D4 dataset with pseudo-labels, EGDR outperforms direct in-context prompting and Chain-of-Thought (CoT) across five LLMs. For instance, on OpenBioLLM, EGDR improves accuracy from 0.31 (Direct) to 0.76 and increases DCS from 0.50 to 0.67. On MedLlama, DCS rises from 0.58 (CoT) to 0.77. Overall, EGDR yields up to +45% accuracy and +36% DCS gains over baseline methods, offering a clinically grounded, interpretable foundation for trustworthy AI-assisted diagnosis.
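
The gains quoted in the abstract can be checked with simple arithmetic. The sketch below recomputes them from the three baseline/EGDR pairs stated above; the helper names `absolute_gain` and `relative_gain` are illustrative and do not come from the paper. The abstract names the baseline for OpenBioLLM accuracy (Direct) and MedLlama DCS (CoT) but not for OpenBioLLM DCS, so that entry is labeled accordingly.

```python
# Figures quoted in the abstract: (model, metric, baseline label, baseline, EGDR).
# The baseline label for OpenBioLLM DCS is not stated in the abstract.
REPORTED = [
    ("OpenBioLLM", "accuracy", "Direct", 0.31, 0.76),
    ("OpenBioLLM", "DCS", "baseline not named", 0.50, 0.67),
    ("MedLlama", "DCS", "CoT", 0.58, 0.77),
]


def absolute_gain(baseline: float, egdr: float) -> float:
    """Difference in raw score units, rounded to two decimals."""
    return round(egdr - baseline, 2)


def relative_gain(baseline: float, egdr: float) -> float:
    """Difference as a fraction of the baseline, rounded to two decimals."""
    return round((egdr - baseline) / baseline, 2)


if __name__ == "__main__":
    for model, metric, label, base, egdr in REPORTED:
        print(
            f"{model} {metric} ({label}): {base:.2f} -> {egdr:.2f}, "
            f"absolute {absolute_gain(base, egdr):+.2f}, "
            f"relative {relative_gain(base, egdr):+.0%}"
        )
```

The OpenBioLLM accuracy difference is 0.45 in absolute terms, which matches the "+45%" headline figure, so that number is presumably a gain in percentage points rather than a relative improvement. The two DCS pairs quoted here give absolute gains of 0.17 and 0.19, so the "+36% DCS" figure cannot be derived from them and presumably refers to a model or comparison not itemized in the abstract.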

Standardization of Psychiatric Diagnoses -- Role of Fine-tuned LLM Consortium and OpenAI-gpt-oss Reasoning LLM Enabled Decision Support System

Artificial Intelligence

Helps doctors diagnose mental health better.

29 Oct 2025

91%

MDD-Thinker: Towards Large Reasoning Models for Major Depressive Disorder Diagnosis

Machine Learning (CS)

Helps doctors find depression faster and better.

29 Sep 2025

91%

Towards Robust and Fair Next Visit Diagnosis Prediction under Noisy Clinical Notes with Large Language Models

Computation and Language

Makes AI doctors more trustworthy with messy notes.

23 Nov 2025

Country of Origin

🇺🇸 United States

Page Count

7 pages