# Exploring AI’s Potential in Papilledema Diagnosis to Support Dermatological Treatment Decisions in Rural Healthcare

**Authors:** Jonathan Shapiro, Mor Atlas, Naomi Fridman, Itay Cohen, Ziad Khamaysi, Mahdi Awwad, Naomi Silverstein, Tom Kozlovsky, Idit Maharshak

PMC · DOI: 10.3390/diagnostics15192547 · Diagnostics · 2025-10-09

## TL;DR

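This study compares ChatGPT-4o's ability to detect papilledema in fundus photographs with that of a fine-tuned ResNet model and two human ophthalmologists. It finds ChatGPT-4o moderately accurate (85.9%) but less accurate than both, while noting its potential as an accessible screening aid in rural healthcare settings.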

## Contribution

The study evaluates ChatGPT-4o's performance in papilledema detection from fundus photographs, comparing it with a domain-specific ResNet-based CNN and two human ophthalmologists, with a focus on dermatological care in resource-limited settings.

## Key findings

- ChatGPT-4o achieved 85.9% accuracy (Cohen’s Kappa = 0.72) in papilledema detection, but its specificity (78.9%) and PPV (73.7%) were lower than those of a ResNet model and human experts.
- The ResNet model reached 99.5% accuracy, outperforming both ChatGPT-4o and human ophthalmologists.
- ChatGPT-4o showed potential as an accessible AI tool for underserved areas despite its lower diagnostic precision.

## Abstract

Background: Papilledema, an ophthalmic finding associated with increased intracranial pressure, is often induced by dermatological medications, including corticosteroids, isotretinoin, and tetracyclines. Early detection is crucial for preventing irreversible optic nerve damage, but access to ophthalmologic expertise is often limited in rural settings. Artificial intelligence (AI) may enable the automated and accurate detection of papilledema from fundus images, thereby supporting timely diagnosis and management.

Objective: The primary objective of this study was to explore the diagnostic capability of ChatGPT-4o, a general large language model with multimodal input, in identifying papilledema from fundus photographs. For context, its performance was compared with a ResNet-based convolutional neural network (CNN) specifically fine-tuned for ophthalmic imaging, as well as with the assessments of two human ophthalmologists. The focus was on applications relevant to dermatological care in resource-limited environments.

Methods: A dataset of 1094 fundus images (295 papilledema, 799 normal) was preprocessed and partitioned into a training set and a test set. The ResNet model was fine-tuned using discriminative learning rates and a one-cycle learning rate policy. GPT-4o and two human evaluators (a senior ophthalmologist and an ophthalmology resident) independently assessed the test images. Diagnostic metrics, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and Cohen’s Kappa, were calculated for each evaluator.

Results: GPT-4o, when applied to papilledema detection, achieved an overall accuracy of 85.9% with substantial agreement beyond chance (Cohen’s Kappa = 0.72), but lower specificity (78.9%) and positive predictive value (73.7%) compared to benchmark models. For context, the ResNet model, fine-tuned for ophthalmic imaging, reached near-perfect accuracy (99.5%, Kappa = 0.99), while two human ophthalmologists achieved accuracies of 96.0% (Kappa ≈ 0.92).

Conclusions: This study explored the capability of GPT-4o, a large language model with multimodal input, for detecting papilledema from fundus photographs. GPT-4o achieved moderate diagnostic accuracy and substantial agreement with the ground truth, but it underperformed compared to both a domain-specific ResNet model and human ophthalmologists. These findings underscore the distinction between generalist large language models and specialized diagnostic AI: while GPT-4o is not optimized for ophthalmic imaging, its accessibility, adaptability, and rapid evolution highlight its potential as a future adjunct in clinical screening, particularly in underserved settings. These findings also underscore the need for validation on external datasets and real-world clinical environments before such tools can be broadly implemented.
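
The diagnostic metrics named in the abstract (sensitivity, specificity, PPV, NPV, accuracy, and Cohen’s Kappa) can all be derived from a 2×2 confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the counts in the example are made up for illustration; they are not taken from the paper. The sketch assumes every denominator is nonzero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary test (hypothetical helper, not from the paper)."""
    tp: int  # papilledema images called papilledema
    fp: int  # normal images called papilledema
    fn: int  # papilledema images called normal
    tn: int  # normal images called normal


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Standard binary diagnostic metrics; assumes nonzero denominators."""
    n = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / n

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the marginal rates of each label.
    p_both_positive = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_both_negative = ((c.tn + c.fn) / n) * ((c.tn + c.fp) / n)
    p_expected = p_both_positive + p_both_negative
    kappa = (accuracy - p_expected) / (1 - p_expected)

    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Illustrative counts only (NOT the study's data).
example = diagnostic_metrics(ConfusionCounts(tp=40, fp=10, fn=5, tn=45))
```

With these illustrative counts, accuracy is 0.85 and Kappa is 0.70. Because PPV and NPV depend on how common papilledema is in the test set, values measured on one dataset may not carry over to a screening population with different prevalence.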

## Linked entities

- **Chemicals:** isotretinoin (PubChem CID 5282379)
- **Diseases:** papilledema (MONDO:0006879)

## Full-text entities

- **Diseases:** optic nerve damage (MESH:D020221), increased intracranial pressure (MESH:D019586), Papilledema (MESH:D010211)
- **Chemicals:** isotretinoin (MESH:D015474), tetracyclines (MESH:D013754)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12523928/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12523928/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/PMC12523928/full.md

---
Source: https://tomesphere.com/paper/PMC12523928