Using Large Language Models to Measure Symptom Severity in Patients At Risk for Schizophrenia
Andrew X. Chen, Guillermo Horga, Sean Escola

TL;DR
This study shows that large language models can predict symptom severity scores from clinical interview transcripts of patients at clinical high risk for schizophrenia, with agreement approaching human rater reliability, which could streamline and standardize assessments.
Contribution
The paper applies LLMs zero-shot to predict BPRS scores from clinical interview transcripts that were not structured around the BPRS, with agreement approaching human inter- and intra-rater reliability.
Findings
LLMs achieved median concordance of 0.84 with true BPRS scores.
Performance was comparable on interviews in foreign languages (median concordance: 0.88).
LLMs can incorporate longitudinal data for improved assessment.
Abstract
Patients who are at clinical high risk (CHR) for schizophrenia need close monitoring of their symptoms to inform appropriate treatments. The Brief Psychiatric Rating Scale (BPRS) is a validated, commonly used research tool for measuring symptoms in patients with schizophrenia and other psychotic disorders; however, it is not commonly used in clinical practice as it requires a lengthy structured interview. Here, we utilize large language models (LLMs) to predict BPRS scores from clinical interview transcripts in 409 CHR patients from the Accelerating Medicines Partnership Schizophrenia (AMP-SCZ) cohort. Despite the interviews not being specifically structured to measure the BPRS, the zero-shot performance of the LLM predictions compared to the true assessment (median concordance: 0.84, ICC: 0.73) approaches human inter- and intra-rater reliability. We further demonstrate that LLMs have…
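The agreement figures quoted above (concordance and ICC) compare predicted scores with clinician-assigned scores. The summary does not say which concordance statistic or which ICC form the authors used, so the following is only a minimal sketch under two assumptions: Lin's concordance correlation coefficient for "concordance" and the two-way, absolute-agreement, single-rater ICC(2,1) for "ICC". The function names and the example scores are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for two paired score lists.

    Assumption: the paper's "concordance" may be a different statistic.
    """
    x, y = list(x), list(y)
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least two items")
    mx, my = fmean(x), fmean(y)
    # Population (1/n) moments, as in Lin's definition.
    var_x = fmean([(a - mx) ** 2 for a in x])
    var_y = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    denom = var_x + var_y + (mx - my) ** 2
    if denom == 0:
        raise ValueError("scores are constant and identical; CCC is undefined")
    return 2 * cov / denom


def icc_2_1(x, y):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    Assumption: treats the LLM and the human rater as k = 2 raters.
    """
    x, y = list(x), list(y)
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least two items")
    n, k = len(x), 2
    grand = fmean(x + y)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]
    col_means = [fmean(x), fmean(y)]
    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), error.
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical item scores on a 1-7 scale; not data from the study.
    clinician = [1, 2, 3, 4]
    predicted = [2, 3, 4, 5]  # perfectly correlated but shifted by one point
    print(lins_ccc(clinician, predicted))
    print(icc_2_1(clinician, predicted))
```

Both statistics penalize a systematic offset between raters, which a plain Pearson correlation would not: in the hypothetical example the predictions are perfectly correlated with the clinician scores, yet both values fall below 1 because every prediction is one point too high.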
