# One Step Closer to Conversational Medical Records: ChatGPT Parses Psoriasis Treatments from EMRs

**Authors:** Jonathan Shapiro, Mor Atlas, Sharon Baum, Felix Pavlotsky, Aviv Barzilai, Rotem Gershon, Romi Gleicher, Itay Cohen

PMC · DOI: 10.3390/jcm14217845 · Journal of Clinical Medicine · 2025-11-05

## TL;DR

This study shows that ChatGPT-4o can extract psoriasis treatment details from free-text medical records with high agreement against expert dermatologist annotations (F1 = 0.94; Cohen’s Kappa = 0.93).

## Contribution

The paper evaluates ChatGPT-4o for structured extraction of psoriasis treatments from unstructured dermatology clinical notes, using expert dermatologist annotations as the reference.

## Key findings

- ChatGPT-4o achieved high recall (0.91) and precision (0.96) in identifying psoriasis treatments from EMRs.
- The model showed excellent agreement with expert annotations (Cohen’s Kappa = 0.93; AUC = 0.98).
- Performance was highest for biologics and methotrexate (F1 = 1.00); recall was lower in categories with vague documentation, such as systemic corticosteroids and antihistamines.
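
The metrics named in these findings all derive from a 2×2 confusion table of model output against expert annotation. The following is a minimal sketch of those standard formulas, not the authors' code; the `Counts` class, the `metrics` function, and the toy counts are illustrative assumptions and do not reproduce the study's data. AUC is omitted because it needs ranked scores rather than counts.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts over (record, treatment) pairs. Hypothetical container."""
    tp: int  # treatment in expert annotation and in model output
    fp: int  # treatment in model output only
    fn: int  # treatment in expert annotation only
    tn: int  # treatment in neither


def metrics(c: Counts) -> dict:
    """Standard binary-classification metrics; assumes no denominator is zero."""
    total = c.tp + c.fp + c.fn + c.tn
    recall = c.tp / (c.tp + c.fn)
    precision = c.tp / (c.tp + c.fp)
    f1 = 2 * precision * recall / (precision + recall)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / total

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_both_yes = ((c.tp + c.fp) / total) * ((c.tp + c.fn) / total)
    p_both_no = ((c.fn + c.tn) / total) * ((c.fp + c.tn) / total)
    p_expected = p_both_yes + p_both_no
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Toy counts for illustration only (not taken from the paper).
toy = metrics(Counts(tp=9, fp=1, fn=1, tn=89))
for name, value in toy.items():
    print(f"{name}: {value:.3f}")
```

Because most treatments are absent from any given record, true negatives dominate the table. This is why specificity and accuracy can sit near 0.99 while recall is noticeably lower, and why a chance-corrected measure such as Cohen’s Kappa is reported alongside them.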

## Abstract

Background: Large Language Models (LLMs), such as ChatGPT, are increasingly applied in medicine for summarization, clinical decision support, and diagnostic assistance, including recent work in dermatology. Previous AI and NLP models in dermatology have mainly focused on lesion classification, diagnostic support, and patient education, while extracting structured treatment information from unstructured dermatology records remains underexplored. We evaluated ChatGPT-4o’s ability to identify psoriasis treatments from free-text documentation, compared with expert annotations.

Methods: In total, 94 electronic medical records (EMRs) of patients diagnosed with psoriasis were analyzed. ChatGPT-4o extracted treatments used for psoriasis from each unstructured clinical note. Its output was compared to manually curated reference annotations by expert dermatologists. A total of 83 treatments, including topical agents, systemic medications, biologics, phototherapy, and procedural interventions, were evaluated. Performance metrics included recall, precision, F1-score, specificity, accuracy, Cohen’s Kappa, and Area Under the Curve (AUC). Analyses were conducted at the individual-treatment level and grouped into pharmacologic categories.

Results: ChatGPT-4o demonstrated strong performance, with recall of 0.91, precision of 0.96, F1-score of 0.94, specificity of 0.99, and accuracy of 0.99. Agreement with expert annotations was high (Cohen’s Kappa = 0.93; AUC = 0.98). Group-level analysis confirmed these results, with the highest performance in biologics and methotrexate (F1 = 1.00) and lower recall in categories with vague documentation, such as systemic corticosteroids and antihistamines.

Conclusions: Our study highlights the potential of LLMs to extract psoriasis treatment information from unstructured clinical documentation and structure it for research and decision support. The model performed best with well-defined, commonly used treatments.
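
The treatment-level and category-level comparison described in the Methods can be pictured as tallying one outcome for every (record, treatment) pair and then pooling those outcomes by category. The sketch below is one plausible way to do that and is not taken from the paper: the four-drug vocabulary, the `CATEGORY` mapping, the note identifiers, and the `confusion_by_category` function are all hypothetical stand-ins for the study's 83 treatments and 94 records.

```python
from collections import Counter

# Hypothetical vocabulary and grouping; the paper's actual category
# assignments are not given in this summary.
CATEGORY = {
    "methotrexate": "conventional systemic",
    "adalimumab": "biologic",
    "secukinumab": "biologic",
    "calcipotriol": "topical",
}


def confusion_by_category(reference, predicted):
    """Tally TP/FP/FN/TN per category over every (record, treatment) pair.

    `reference` and `predicted` map a record id to the set of treatments
    listed by the expert annotation and by the model, respectively.
    """
    counts = {category: Counter() for category in set(CATEGORY.values())}
    for record_id, ref_set in reference.items():
        pred_set = predicted.get(record_id, set())
        for treatment, category in CATEGORY.items():
            in_ref = treatment in ref_set
            in_pred = treatment in pred_set
            if in_ref and in_pred:
                outcome = "tp"
            elif in_pred:
                outcome = "fp"
            elif in_ref:
                outcome = "fn"
            else:
                outcome = "tn"
            counts[category][outcome] += 1
    return counts


# Toy annotations for three invented notes (illustration only).
reference = {
    "note1": {"methotrexate", "calcipotriol"},
    "note2": {"adalimumab"},
    "note3": {"secukinumab", "calcipotriol"},
}
predicted = {
    "note1": {"methotrexate"},
    "note2": {"adalimumab"},
    "note3": {"secukinumab", "calcipotriol", "methotrexate"},
}

counts = confusion_by_category(reference, predicted)
for category, tally in sorted(counts.items()):
    print(category, dict(tally))
```

Pooling counts within a category before computing recall or F1 means that a category with few documented uses can swing sharply on a single missed mention, which is one reason sparsely or vaguely documented categories may show lower recall.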

## Linked entities

- **Diseases:** psoriasis (MONDO:0005083)

## Full-text entities

- **Diseases:** Psoriasis (MESH:D011565)
- **Chemicals:** methotrexate (MESH:D008727)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12608328/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12608328/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/PMC12608328/full.md

---
Source: https://tomesphere.com/paper/PMC12608328