# Recovery in personality disorders: the development and preliminary testing of a novel natural language processing model to identify recovery in mental health electronic records

**Authors:** Giouliana Kadra-Scalzo, Jaya Chaturvedi, Oliver Dale, Richard D. Hayes, Lifang Li, Shaza Mahmood, Jonathan Monk-Cunliffe, Angus Roberts, Paul Moran

PMC · DOI: 10.3389/fdgth.2025.1544781 · Frontiers in Digital Health · 2025-04-03

## TL;DR

This paper develops and evaluates natural language processing (NLP) models that detect occupational and activities of daily living (ADL) recovery in the free-text mental health records of individuals diagnosed with personality disorder.

## Contribution

The authors developed and preliminarily tested novel NLP models that identify two recovery domains, occupational and ADL, in electronic health records of individuals diagnosed with personality disorder.

## Key findings

- NLP models for recovery in activities of daily living (ADL) achieved a precision of 0.80 (95% CI: 0.73–0.84), outperforming models for occupational recovery (precision: 0.62; 95% CI: 0.52–0.72).
- The models generally missed at least 50% of the individuals who had recovered, leaving substantial room for improvement.
- It is feasible to use NLP to identify recovery domains among individuals diagnosed with personality disorder.

## Abstract

The concept of recovery is of great importance in mental health as it emphasizes improvements in quality of life and functioning alongside the traditional focus on symptomatic remission. Yet, investigating non-symptomatic recovery in the field of personality disorders has been particularly challenging due to complexities in capturing the occurrence of recovery. Electronic health records (EHRs) provide a robust platform from which episodes of recovery can be detected. However, much of the relevant information may be embedded in free-text clinical notes, requiring the development of appropriate tools to extract these data.

Using data from one of Europe's largest electronic health records databases [the Clinical Records Interactive Search (CRIS)], we developed and evaluated natural language processing (NLP) models for the identification of occupational and activities of daily living (ADL) recovery among individuals diagnosed with personality disorder.

The models on ADL performed better (precision: 0.80; 95% CI: 0.73–0.84) than those on occupational recovery (precision: 0.62; 95% CI: 0.52–0.72). However, the models were weaker at identifying everyone who had recovered, generally missing at least 50% of those who had recovered.
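To make the two metrics in this paragraph concrete, here is a minimal sketch of how precision, recall, and a 95% confidence interval can be computed from classification counts. The counts are hypothetical and chosen only for illustration. The Wilson score interval is an assumption, since this summary does not state how the paper derived its intervals, and the function names are illustrative.

```python
import math


def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true positive, false positive
    and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%).

    Assumption: the summary does not say which interval method the
    paper used; Wilson is one common choice for proportions.
    """
    if n == 0:
        return 0.0, 0.0
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Hypothetical counts, not taken from the paper.
tp, fp, fn = 80, 20, 80
precision, recall = precision_recall(tp, fp, fn)
low, high = wilson_interval(tp, tp + fp)
print(f"precision={precision:.2f} (95% CI {low:.2f}-{high:.2f}), recall={recall:.2f}")
```

With these illustrative counts, most flagged cases are correct (high precision), yet half of the truly recovered individuals go undetected (recall of 0.50), which is the pattern the paragraph describes.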

It is feasible to develop NLP models for the identification of recovery domains for individuals with a diagnosis of personality disorder. Future research needs to improve the efficiency of pre-processing strategies to handle long clinical documents.
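One common way to handle long documents is to split them into overlapping windows before classification. The sketch below illustrates that general idea only. It is not the paper's pre-processing method, which this summary does not describe, and the function name, window size, and overlap are hypothetical.

```python
def chunk_tokens(tokens, size, overlap):
    """Split a token list into windows of at most `size` tokens, where
    consecutive windows share `overlap` tokens.

    Illustrative only: the paper's actual pre-processing is not
    described in this summary.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


# Hypothetical clinical-note fragment, whitespace-tokenised for simplicity.
note = "patient reports returning to part time work and cooking daily".split()
for window in chunk_tokens(note, size=4, overlap=2):
    print(window)
```

Overlap keeps phrases that straddle a window boundary intact in at least one window, at the cost of classifying more windows per document.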

## Linked entities

- **Diseases:** personality disorder (MONDO:0002028)

## Full-text entities

- **Diseases:** personality disorder (MESH:D010554)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12003297/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12003297/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/PMC12003297/full.md

---
Source: https://tomesphere.com/paper/PMC12003297