# Using large language models as a scalable mental status evaluation technique

**Authors:** Margot Wagner, Callum Stephenson, Jasleen Jagayat, Anchan Kumar, Amir Shirazi, Nazanin Alavi, Mohsen Omrani

PMC · DOI: 10.1038/s44277-025-00042-z · NPP - Digital Psychiatry and Neuroscience · 2025-11-13

## TL;DR

This study shows that a RoBERTa-based language model can identify signs of anxiety and depression from text, offering a scalable tool for mental health evaluation.

## Contribution

The study introduces a RoBERTa-based model for mental status evaluation using only text, achieving accuracy comparable to human experts.

## Key findings

- The model achieved 74% accuracy in predicting whether individual sentences are symptomatic of anxiety or depression.
- Performance was comparable to clinical evaluations despite lacking prosodic or visual cues.
- Backtranslation-based text augmentation expanded the dataset, and hyperparameter tuning optimized model performance.
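
As a rough illustration of the augmentation step, backtranslation paraphrases a sentence by translating it into a pivot language and back, keeping the original label. The sketch below is not the authors' pipeline: the `translate` callable, its `(text, source_lang, target_lang)` signature, and the pivot languages are all assumptions, and any real machine translation system would have to be plugged in by the reader.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
# No specific translation library is implied; supply your own implementation.
Translator = Callable[[str, str, str], str]
Example = Tuple[str, int]  # (sentence, label), e.g. 1 = symptomatic, 0 = not


def backtranslate(sentence: str, translate: Translator,
                  pivot: str, source: str = "en") -> str:
    """Translate source -> pivot -> source to obtain a paraphrase."""
    pivot_text = translate(sentence, source, pivot)
    return translate(pivot_text, pivot, source)


def augment(dataset: Sequence[Example], translate: Translator,
            pivots: Sequence[str] = ("de", "fr")) -> List[Example]:
    """Return the original examples plus novel backtranslated paraphrases.

    Paraphrases inherit the label of the sentence they came from. Outputs
    that match an existing sentence (case-insensitively) are skipped so the
    augmented set contains no duplicates.
    """
    originals = list(dataset)
    augmented = list(originals)
    seen = {text.strip().lower() for text, _ in originals}
    for text, label in originals:
        for pivot in pivots:
            paraphrase = backtranslate(text, translate, pivot)
            key = paraphrase.strip().lower()
            if key and key not in seen:
                seen.add(key)
                augmented.append((paraphrase, label))
    return augmented
```

Augmentation of this kind is normally applied to the training split only, so that paraphrases of evaluation sentences do not leak into training.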

## Abstract

Mental health care faces a significant gap in service availability, with demand for services far surpassing available care. Building scalable and objective measurement tools for mental health evaluation is therefore a primary concern. Because spoken language is central to diagnostics and treatment, language analysis stands out as a potential methodology, and this study focuses on leveraging large language models to bridge the gap. Here, a RoBERTa-based transformer model is fine-tuned for mental health status evaluation using natural language processing. The model analyzes written language without access to the prosodic, motor, or visual cues commonly used in clinical mental status exams. Using non-clinical data from online forums and clinical data from a board-reviewed online psychotherapy trial, this study provides preliminary evidence that large language models can support symptom identification, classifying sentences with an accuracy comparable to that of human experts. The text dataset is expanded through backtranslation-based augmentation, and model performance is optimized through hyperparameter tuning. Specifically, a RoBERTa-based model is fine-tuned on psychotherapy session text to predict whether individual sentences are symptomatic of anxiety or depression, reaching a prediction accuracy of 74%, on par with clinical evaluations.

Mental health services struggle to keep up with growing demand. This study explores how a RoBERTa large language model can help by analyzing text for signs of anxiety or depression. It uses online therapy transcripts and forum posts, learning to spot symptoms from what people write, even without tone or facial cues. With accuracy similar to human experts, this approach shows promise for making mental health evaluation more accessible and scalable.
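
To make the evaluation and tuning steps concrete, the sketch below computes sentence-level accuracy against expert labels and uses it to pick a hyperparameter configuration. It rests on several assumptions: the abstract does not say which search strategy or hyperparameters were used, so the exhaustive grid, the parameter names, and the `fit_predict` callable (a stand-in for fine-tuning a classifier and predicting on a validation set) are all hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Config = Dict[str, object]


def accuracy(predicted: Sequence[int], expert: Sequence[int]) -> float:
    """Fraction of sentences whose predicted label matches the expert label."""
    if not expert or len(predicted) != len(expert):
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(p == e for p, e in zip(predicted, expert)) / len(expert)


def grid_search(grid: Dict[str, List[object]],
                fit_predict: Callable[[Config], Sequence[int]],
                val_labels: Sequence[int]) -> Tuple[Config, float]:
    """Try every combination in `grid`; return the best config and its accuracy.

    `fit_predict` is a hypothetical callable that trains a model with the
    given config and returns predictions for the validation sentences.
    Ties keep the first configuration encountered.
    """
    names = sorted(grid)
    best_cfg: Config = {}
    best_acc = -1.0
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        acc = accuracy(fit_predict(cfg), val_labels)
        if acc > best_acc:
            best_cfg, best_acc = cfg, acc
    return best_cfg, best_acc
```

Under this definition, a reported accuracy of 74% means that roughly three out of four sentences received the same label from the model as from the reference annotation. Selecting the configuration on a validation split, and reporting accuracy on a separate held-out split, avoids an optimistic estimate.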

## Linked entities

- **Diseases:** anxiety (MONDO:0005618), depression (MONDO:0002050)

## Full-text entities

- **Diseases:** depression (MESH:D003866), anxiety (MESH:D001007)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12624874/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12624874/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/PMC12624874/full.md

---
Source: https://tomesphere.com/paper/PMC12624874