Leveraging Evidence-Guided LLMs to Enhance Trustworthy Depression Diagnosis

Yining Yuan; J. Ben Tamo; Micky C. Nnamdi; Yifei Wang; May D. Wang

arXiv:2511.17947 · cs.AI · November 25, 2025

Open Access

TL;DR

This paper introduces a two-stage framework using evidence-guided reasoning and confidence scoring to improve the transparency, trustworthiness, and accuracy of large language models in diagnosing depression.

Contribution

It presents a novel Evidence-Guided Diagnostic Reasoning and Confidence Scoring approach that enhances interpretability and reliability in clinical diagnosis with LLMs.

Findings

01. EGDR outperforms prompt-based methods, with accuracy gains of up to 45%.

02. The Diagnosis Confidence Scoring (DCS) metrics improve the reliability of diagnoses.

03. The framework demonstrates significant gains across multiple LLMs.

Abstract

Large language models (LLMs) show promise in automating clinical diagnosis, yet their non-transparent decision-making and limited alignment with diagnostic standards hinder trust and clinical adoption. We address this challenge by proposing a two-stage diagnostic framework that enhances transparency, trustworthiness, and reliability. First, we introduce Evidence-Guided Diagnostic Reasoning (EGDR), which guides LLMs to generate structured diagnostic hypotheses by interleaving evidence extraction with logical reasoning grounded in DSM-5 criteria. Second, we propose a Diagnosis Confidence Scoring (DCS) module that evaluates the factual accuracy and logical consistency of generated diagnoses through two interpretable metrics: the Knowledge Attribution Score (KAS) and the Logic Consistency Score (LCS). Evaluated on the D4 dataset with pseudo-labels, EGDR outperforms direct in-context…

Taxonomy

Topics: Machine Learning in Healthcare · Mental Health via Writing · Artificial Intelligence in Healthcare and Education