Systematic Evaluation of Machine-Generated Reasoning and PHQ-9 Labeling for Depression Detection Using Large Language Models

Zongru Shao; Xin Wang; Zhanyang Liu; Chenhan Wang; K.P. Subbalakshmi

arXiv:2505.17119 · cs.CL · May 26, 2025

TL;DR

This paper systematically evaluates how large language models reason when detecting depression. It finds that their reasoning is strongest on explicit depression language and proposes optimization strategies such as Direct Preference Optimization (DPO) to improve accuracy.

Contribution

The paper introduces a framework for systematically analyzing LLM reasoning in depression detection and explores optimization methods, notably DPO, to improve performance.

Findings

01

LLMs are more accurate with explicit depression language

02

DPO significantly improves detection performance

03

Human verification highlights strengths and weaknesses in LLM reasoning
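
Finding 02 refers to DPO (Direct Preference Optimization). This summary does not describe the paper's training setup, so the following is only a minimal sketch of the standard per-example DPO objective, assuming you already have summed log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") response under the trained policy and under a frozen reference model. The function names, argument names, and the default `beta=0.1` are illustrative assumptions, not values taken from the paper.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed temperature; not taken from the paper
) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    The margin compares how much more the policy favors the chosen
    response over the rejected one, relative to the reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) equals softplus(-m)
    return softplus(-margin)


if __name__ == "__main__":
    # Policy identical to the reference: loss is log(2)
    print(dpo_loss(-5.0, -5.0, -5.0, -5.0))
    # Policy prefers the chosen response more than the reference does: lower loss
    print(dpo_loss(-4.0, -6.0, -5.0, -5.0))
```

The loss falls below log(2) only when the policy ranks the chosen response above the rejected one by more than the reference model does. In a depression-detection setting, the chosen and rejected responses could be a correct and an incorrect reasoning trace for the same post, though how the paper constructs its preference pairs is not stated in this summary.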

Abstract

Recent research leverages large language models (LLMs) for early detection of mental health conditions such as depression, with models often optimized on machine-generated data. However, their detection may be subject to unknown weaknesses. Meanwhile, quality control has not been applied to these generated corpora beyond limited human verification. Our goal is to systematically evaluate LLM reasoning and reveal potential weaknesses. To this end, we first provide a systematic evaluation of the reasoning over machine-generated detection and interpretation. Then we use the models' reasoning abilities to explore mitigation strategies for enhanced performance. Specifically, we do the following: A. Design an LLM instruction strategy that allows for systematic analysis of the detection by breaking down the task into several subtasks. B. Design contrastive few-shot and chain-of-thought prompts by selecting typical…
