# A Novel Multimodal Implementation of a Foundation Artificial Intelligence Model Using Optic Nerve Head Fundus Photographs and OCT Imaging for Glaucoma Detection

**Authors:** Benton Chuter, Vedant Joshi, Shahin Hallaj, Evan Walker, Christopher Bowd, Akram Belghith, Michael H. Goldbaum, Andrzej Grzybowski, Massimo A. Fazio, Christopher A. Girkin, C. Gustavo De Moraes, Jeffrey M. Liebmann, Robert N. Weinreb, Linda M. Zangwill, Mark Christopher

PMC · DOI: 10.1016/j.xops.2025.101012 · Ophthalmology Science · 2025-11-17

## TL;DR

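This study compares RETFound-based AI models that use a single imaging modality (fundus photographs or OCT) with a model that combines both for glaucoma detection. Combining modalities significantly improved on fundus photographs alone (AUC 0.94 vs. 0.86) but not on OCT alone (AUC 0.93), so OCT alone is nearly as good.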

## Contribution

The study develops a multimodal implementation of RETFound for glaucoma detection using paired fundus and OCT images, and evaluates its performance and generalizability against unimodal fundus-only and OCT-only models.

## Key findings

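- The multimodal model achieved an AUC of 0.94, significantly outperforming the color fundus photograph (CFP) unimodal model (AUC 0.86, P < 0.001) but not the OCT unimodal model (AUC 0.93, P = 0.47).
- Precision and recall were higher for the multimodal model (0.96 and 0.87) than for the CFP model (0.92 and 0.69) across all subgroups.
- All models detected moderate-to-severe glaucoma better than mild glaucoma; the difference was significant for the unimodal CFP and OCT models.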

## Abstract

**Purpose:** To compare the performance of unimodal and multimodal implementations of the self-supervised learning model RETFound in detecting glaucoma using color fundus photographs (CFPs) and OCT images, and to assess its generalizability across different ethnicities, age groups, and disease severities.

**Design:** Evaluation of a diagnostic technology.

**Participants:** Fourteen thousand five hundred ten CFPs and 32 640 OCTs from 1948 eyes of 1098 participants (60.8% glaucoma, 39.2% healthy) from the Diagnostic Innovations in Glaucoma Study and the African Descent and Glaucoma Evaluation Study were included. Glaucoma was defined as photograph-based glaucomatous optic neuropathy with or without repeatable glaucoma visual field damage.

**Methods:** A multimodal RETFound model was developed using paired CFPs and OCT images. The model was compared to unimodal RETFound models using solely CFP or OCT images. Performance was also stratified by race (Black vs. White), age (<60 vs. ≥60 years), and disease severity (mild vs. moderate-to-severe glaucoma).
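The stratified comparison described above can be illustrated with a minimal sketch. It assumes each evaluated image is a record with a true label, a binary prediction, and demographic fields; the record keys, the helper names, and the toy data are hypothetical and are not taken from the paper. Recall is used as the example metric, and the age cut-off of 60 years follows the abstract. Severity strata are omitted because the summary does not state how mild and moderate-to-severe glaucoma were defined.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]  # hypothetical keys: "label", "pred", "race", "age"


def recall(records: List[Record]) -> float:
    """Fraction of glaucoma records (label == 1) that were predicted as glaucoma."""
    positives = [r for r in records if r["label"] == 1]
    if not positives:
        return 0.0
    return sum(1 for r in positives if r["pred"] == 1) / len(positives)


def stratify(records: List[Record], key_fn: Callable[[Record], str]) -> Dict[str, List[Record]]:
    """Group records by the subgroup name returned by key_fn."""
    groups: Dict[str, List[Record]] = defaultdict(list)
    for r in records:
        groups[key_fn(r)].append(r)
    return dict(groups)


def age_group(r: Record) -> str:
    """Age strata as in the abstract: <60 vs. >=60 years."""
    return "<60" if r["age"] < 60 else ">=60"


# Toy records for illustration only (not study data).
records = [
    {"label": 1, "pred": 1, "race": "Black", "age": 55},
    {"label": 1, "pred": 0, "race": "White", "age": 58},
    {"label": 0, "pred": 0, "race": "Black", "age": 45},
    {"label": 1, "pred": 1, "race": "White", "age": 67},
    {"label": 1, "pred": 1, "race": "Black", "age": 72},
    {"label": 0, "pred": 1, "race": "White", "age": 61},
]

recall_by_age = {g: recall(rs) for g, rs in stratify(records, age_group).items()}
recall_by_race = {g: recall(rs) for g, rs in stratify(records, lambda r: r["race"]).items()}
```

The same `stratify` helper works for any metric: swapping `recall` for a precision or AUC function gives the corresponding per-subgroup values.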

**Main Outcome Measures:** Diagnostic accuracy of unimodal and multimodal RETFound models using CFP and OCT for detecting glaucoma was assessed using the area under the receiver operating characteristic curve (AUC), precision, and recall.
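For readers unfamiliar with these three measures, the following is a minimal sketch of how they can be computed from model scores. It is not the authors' evaluation code. The function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration. AUC is computed in its rank (Mann-Whitney) form: the probability that a randomly chosen glaucoma image scores higher than a randomly chosen healthy image.

```python
from typing import Sequence, Tuple


def auc_rank(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision_recall(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN) at a fixed threshold."""
    tp = fp = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data for illustration only: 1 = glaucoma, 0 = healthy.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = auc_rank(labels, scores)                        # 14 of 16 pairs ranked correctly
precision, recall = precision_recall(labels, scores)  # 3 TP, 1 FP, 1 FN
```

AUC is threshold-free, whereas precision and recall depend on the chosen operating point, which is why a model can match another on AUC yet differ in recall.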

**Results:** The multimodal model for glaucoma detection achieved an AUC of 0.94 (95% confidence interval: 0.91–0.97), significantly outperforming the CFP unimodal model (AUC 0.86 [95% confidence interval: 0.81–0.89], P < 0.001) but not the OCT unimodal model (AUC 0.93 [95% confidence interval: 0.90–0.96], P = 0.47). Precision and recall were higher (0.96 and 0.87, respectively) for the multimodal model compared with the CFP model (0.92 and 0.69) across all subgroups. No significant differences based on race or age were found in either unimodal or multimodal glaucoma detection models. All models exhibited better performance in detecting moderate-to-severe glaucoma than mild glaucoma, with significant differences in the unimodal CFP (P = 0.002) and OCT (P = 0.005) models.
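This summary does not state how the 95% confidence intervals were obtained. One common approach, shown here purely as an assumption, is a percentile bootstrap that resamples whole participants rather than individual images, since the dataset contains multiple images per eye and up to two eyes per participant. All names, parameters, and the synthetic data below are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def auc_rank(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cluster_bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    groups: Sequence[str],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile CI for AUC, resampling participants (groups) with replacement."""
    rng = random.Random(seed)
    by_group: Dict[str, List[int]] = {}
    for i, g in enumerate(groups):
        by_group.setdefault(g, []).append(i)
    ids = list(by_group)
    stats: List[float] = []
    for _ in range(n_boot):
        sampled = [rng.choice(ids) for _ in ids]
        idx = [i for g in sampled for i in by_group[g]]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample contains one class only
        stats.append(auc_rank(ys, [scores[i] for i in idx]))
    if not stats:
        raise ValueError("no valid bootstrap resamples")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


# Synthetic data for illustration only: 40 participants, 2 images each.
data_rng = random.Random(42)
labels, scores, groups = [], [], []
for pid in range(40):
    y = pid % 2  # alternate glaucoma / healthy participants
    for _ in range(2):
        labels.append(y)
        scores.append(data_rng.gauss(0.65 if y else 0.35, 0.15))
        groups.append(f"P{pid}")

point = auc_rank(labels, scores)
ci_low, ci_high = cluster_bootstrap_auc_ci(labels, scores, groups, n_boot=500)
```

Resampling at the participant level keeps correlated images together, which typically yields wider and more honest intervals than resampling images independently.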

**Conclusions:** The multimodal RETFound model demonstrated improved diagnostic ability compared with the CFP unimodal model but did not significantly outperform the OCT unimodal model in glaucoma detection. As clinical implementation of a unimodal artificial intelligence (AI) model is easier than a multimodal counterpart, our results suggest unimodal OCT AI models may be sufficient for detecting glaucoma.

Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

## Linked entities

- **Diseases:** glaucoma (MONDO:0005041)

## Full-text entities

- **Diseases:** glaucomatous optic neuropathy (MESH:D009901), glaucoma (MESH:D005901)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12861149/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12861149/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/PMC12861149/full.md

---
Source: https://tomesphere.com/paper/PMC12861149