# Artificial intelligence in the diagnosis and prognosis of ocular trauma: a systematic review

**Authors:** Zahra Abbasi Dolatabadi, Mahdi Nabi Foodani, Mohammad Fayyazi Farkhad, Faezeh Golvardi-Yazdi

PMC · DOI: 10.1186/s12886-026-04688-x · BMC Ophthalmology · 2026-03-14

## TL;DR

This systematic review explores how artificial intelligence can help diagnose and predict outcomes for eye injuries, finding that deep learning models show high accuracy.

## Contribution

The paper provides the first comprehensive synthesis of AI applications for diagnosing and predicting outcomes in ocular trauma.

## Key findings

- Deep learning models like DenseNet-169 and UNet achieved up to 96% diagnostic accuracy and AUC values of 0.99.
- ANNs outperformed traditional scoring systems in prognostic prediction with up to 93% accuracy.
- ChatGPT-4 showed 100% diagnostic accuracy but had 30% inconsistency in treatment decisions.

## Abstract

Ocular trauma is a leading cause of acquired monocular blindness worldwide, requiring prompt and accurate diagnosis and prognosis. While artificial intelligence (AI) has shown growing potential in medicine, a comprehensive synthesis of its applications in the diagnosis and prognosis of ocular trauma is still lacking. This systematic review therefore synthesizes the current evidence on the diagnostic and prognostic performance of AI in ocular trauma.

This systematic review was conducted in accordance with the PRISMA 2020 guidelines. PubMed, Scopus, and Web of Science were searched from inception to May 2025. Original studies were included if they used AI for the diagnosis or prognosis of ocular trauma, applied human clinical data, and reported performance metrics; review articles and animal studies were excluded. Two reviewers independently screened the studies, extracted relevant data, and assessed the risk of bias using the PROBAST tool. A meta-analysis was not performed because of substantial heterogeneity: the included studies differed markedly in outcome definitions, study designs, model types, and data modalities, and reported effect estimates in non-comparable forms, so quantitative pooling was not methodologically appropriate. The findings were therefore synthesized using a narrative approach. Heterogeneity was explored descriptively by examining the variability in reported AUC ranges across subgroups defined by outcome definition, model type, and data modality. Formal quantitative heterogeneity statistics were not calculated because of inconsistent reporting and insufficient data within subgroups.
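The descriptive exploration of heterogeneity described above (comparing reported AUC ranges across subgroups) can be illustrated with a minimal Python sketch. The record layout, the function name `auc_ranges`, and every value below are assumptions made for illustration; they are placeholders and not data extracted from the included studies.

```python
from collections import defaultdict

# Hypothetical records: (study label, subgroup, reported AUC).
# These values are illustrative placeholders, NOT results from the review.
records = [
    ("study_a", "image", 0.99),
    ("study_b", "image", 0.91),
    ("study_c", "clinical", 0.85),
    ("study_d", "clinical", 0.78),
]


def auc_ranges(rows):
    """Return {subgroup: (min_auc, max_auc, n_studies)} for descriptive comparison."""
    grouped = defaultdict(list)
    for _study, subgroup, auc in rows:
        grouped[subgroup].append(auc)
    return {name: (min(vals), max(vals), len(vals)) for name, vals in grouped.items()}


ranges = auc_ranges(records)
for subgroup, (low, high, n) in sorted(ranges.items()):
    print(f"{subgroup}: AUC {low:.2f}-{high:.2f} (n={n})")
```

Reporting only the range and the study count per subgroup, with no pooled estimate, matches a narrative synthesis: it shows how widely performance varies without implying that the underlying effect measures are comparable.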

A total of 112 records were identified across PubMed, Scopus, and Web of Science. After removing duplicates and screening titles and abstracts, 10 studies met the predefined inclusion criteria and were systematically analyzed. These studies investigated different AI techniques applied to ocular trauma. Across the included studies, deep learning models such as DenseNet-169 and UNet demonstrated high diagnostic accuracy (up to 96%) and area under the curve (AUC) values of 0.99, while artificial neural networks (ANNs) outperformed traditional scoring systems such as the Ocular Trauma Score (OTS) in prognostic prediction (accuracy up to 93%). ChatGPT-4 showed perfect diagnostic accuracy (100%) but a 30% inconsistency in treatment decisions. A narrative synthesis revealed that model performance varied by input data type, with image-based models generally outperforming those relying on clinical data alone.
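For readers less familiar with the two metrics reported above, the following minimal sketch shows how accuracy and AUC are commonly computed for a binary classifier. It is a generic illustration and not the evaluation code of any included study; the labels, scores, and 0.5 threshold are made-up toy values. AUC is computed here through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Each (positive, negative) pair scores 1 if ranked correctly, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = injury present, 0 = absent.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

acc = accuracy(y_true, y_pred)
roc_auc = auc(y_true, scores)
print(f"accuracy={acc:.3f}, AUC={roc_auc:.3f}")
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two figures are not interchangeable when comparing studies.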

The findings suggest that AI models, particularly deep learning and neural network architectures, can assist clinicians in rapid and accurate diagnosis, optimize triage decisions in emergency settings, and improve prognostic prediction of visual outcomes. Despite these promising results, challenges such as variability in data sources, lack of external validation, and ethical considerations persist. Future research should focus on multicenter validation, standardized datasets, and human–AI integration frameworks to facilitate clinical application.

PROSPERO registration: CRD420251043768

## Full-text entities

- **Genes:** NINL (ninein like) [NCBI Gene 22981] {aka NLP}, MCC (MCC regulator of Wnt signaling pathway) [NCBI Gene 4163] {aka MCC1}, SHROOM4 (shroom family member 4) [NCBI Gene 57477] {aka MRXSSDS, SHAP, shrm4}
- **Diseases:** monocular blindness (MESH:D001766), age-related macular degeneration (MESH:D008268), mandibular fracture (MESH:D008337), vision loss (MESH:D014786), eye injuries (MESH:D005131), jaw or facial fractures (MESH:D007572), Blowout Fracture (MESH:D050723), Open Globe Injury (MESH:D006259), edema (MESH:D004487), Ocular Trauma (MESH:D014947), Traumatic Optic Neuropathy (MESH:D020221), hemorrhage (MESH:D006470), ophthalmic diseases (MESH:C535922), Pupillary Defect (MESH:D011681), glaucoma (MESH:D005901), orbital fracture (MESH:D009917), structural abnormalities (MESH:C566527), Retinal Detachment (MESH:D012163), diabetic retinopathy (MESH:D003930)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12990504/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12990504/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/PMC12990504/full.md

---
Source: https://tomesphere.com/paper/PMC12990504