A Systematic Evaluation of Large Language Models for PTSD Severity Estimation: The Role of Contextual Knowledge and Modeling Strategies
Panagiotis Kaliosis, Adithya V Ganesan, Oscar N.E. Kjell, Whitney Ringwald, Scott Feltman, Melissa A. Carr, Dimitris Samaras, Camilo Ruggero, Benjamin J. Luft, Roman Kotov, Andrew H. Schwartz

TL;DR
This study systematically evaluates how different contextual and modeling strategies impact the accuracy of large language models in estimating PTSD severity from narratives, highlighting the importance of detailed context and ensemble methods.
Contribution
It provides a comprehensive analysis of factors influencing LLM performance in mental health assessment, including context, reasoning, and ensemble strategies, across multiple models.
Findings
Detailed construct definitions improve accuracy.
Increased reasoning effort enhances estimation.
Ensembling models yields best performance.
Abstract
Large language models (LLMs) are increasingly being used in a zero-shot fashion to assess mental health conditions, yet we have limited knowledge on what factors affect their accuracy. In this study, we utilize a clinical dataset of natural language narratives and self-reported PTSD severity scores from 1,437 individuals to comprehensively evaluate the performance of 11 state-of-the-art LLMs. To understand the factors affecting accuracy, we systematically varied (i) contextual knowledge like subscale definitions, distribution summary, and interview questions, and (ii) modeling strategies including zero-shot vs few shot, amount of reasoning effort, model sizes, structured subscales vs direct scalar prediction, output rescaling and nine ensemble methods. Our findings indicate that (a) LLMs are most accurate when provided with detailed construct definitions and context of the narrative;…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMental Health via Writing · Posttraumatic Stress Disorder Research · Machine Learning in Healthcare
