# Early prediction of diabetic retinopathy using a multimodal deep learning framework integrating fundus and OCT imaging

**Authors:** Abdel-Hamid M. Emara, Jawad Hasan Alkhateeb, Ghada Atteia, Aiman Turani, Jamal Zraqou, Zeinab Elsawaf, Abid Jameel

PMC · DOI: 10.3389/fmed.2025.1741146 · Frontiers in Medicine · 2026-01-09

## TL;DR

This study introduces a deep learning framework that combines fundus and OCT images to improve early detection of diabetic retinopathy.

## Contribution

A novel multimodal deep learning framework that integrates fundus and OCT imaging for early diabetic retinopathy prediction.

## Key findings

- The multimodal framework achieved 90.5% accuracy and 0.970 AUC-ROC on a curated subset of 222 modality-paired images (111 fundus, 111 OCT).
- Attention-based feature fusion emphasizes diagnostically relevant regions across modalities.
- Results suggest potential for AI-assisted early DR screening, but they are preliminary given the small dataset and require validation on larger, more diverse datasets.
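
The two headline metrics above can be reproduced from predicted probabilities. The following is a minimal sketch, assuming a binary DR / no-DR labeling (the summary does not state the class setup) and using made-up toy scores rather than the paper's data. The function names `accuracy` and `auc_roc` are illustrative, not taken from the paper.

```python
import numpy as np

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_prob, dtype=float) >= threshold).astype(int)
    return float((y_pred == y_true).mean())

def auc_roc(y_true, y_prob):
    """AUC-ROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a correct pair.
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    pos = y_prob[y_true == 1]
    neg = y_prob[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # all (positive, negative) score pairs
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / (pos.size * neg.size))

# Toy labels and scores (hypothetical, for illustration only).
y_true = [0, 0, 1, 1, 1, 0]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]

acc = accuracy(y_true, y_prob)
auc = auc_roc(y_true, y_prob)
```

Accuracy depends on the chosen decision threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is one reason the two are commonly reported together.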

## Abstract

Diabetic Retinopathy (DR) remains a leading cause of preventable vision impairment among individuals with diabetes, particularly when not identified in its early stages. Conventional diagnostic techniques typically employ either fundus photography or Optical Coherence Tomography (OCT), with each modality offering distinct yet partial insights into retinal abnormalities. This study proposes a multimodal diagnostic framework that fuses both structural and spatial retinal characteristics through the integration of fundus and OCT imagery. We utilize a curated subset of 222 high-quality, modality-paired images (111 fundus + 111 OCT), selected from a larger publicly available dataset based on strict inclusion criteria including image clarity, diagnostic labeling, and modality alignment. Feature extraction pipelines are optimized for each modality to capture relevant pathological markers, and the extracted features are fused using an attention-based weighting mechanism that emphasizes diagnostically salient regions across modalities. The proposed approach achieves an accuracy of 90.5% and an AUC-ROC of 0.970 on this curated subset, indicating promising feasibility of multimodal fusion for early-stage DR assessment. Given the limited dataset size, these results should be interpreted as preliminary, demonstrating methodological potential rather than large-scale robustness. The study highlights the clinical value of hybrid imaging frameworks and AI-assisted screening tools, while emphasizing the need for future validation on larger and more diverse datasets.
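
The attention-based weighting described in the abstract can be illustrated with a small sketch. This is not the authors' architecture: it is a simplified, modality-level version (one weight per modality per sample, rather than region-level attention), and the scoring vector `w`, the feature dimension, and the random inputs are all hypothetical stand-ins for learned parameters and real encoder outputs.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(fundus_feat, oct_feat, w):
    """Fuse two modality feature vectors with per-sample attention weights.

    fundus_feat, oct_feat: arrays of shape (batch, dim) from each encoder.
    w: scoring vector of shape (dim,); hypothetical, would be learned in training.
    Returns the fused features (batch, dim) and the weights (batch, 2).
    """
    stacked = np.stack([fundus_feat, oct_feat], axis=1)  # (batch, 2, dim)
    scores = np.tanh(stacked) @ w                         # (batch, 2)
    weights = softmax(scores, axis=1)                     # each row sums to 1
    fused = (weights[..., None] * stacked).sum(axis=1)    # weighted sum over modalities
    return fused, weights

# Random features standing in for fundus and OCT encoder outputs.
rng = np.random.default_rng(0)
fundus = rng.normal(size=(4, 8))
oct_features = rng.normal(size=(4, 8))
w = rng.normal(size=8)

fused, weights = attention_fuse(fundus, oct_features, w)
```

Because the weights are a softmax over the two modalities, each fused vector is a convex combination of the fundus and OCT features, so the model can lean on whichever modality scores as more informative for a given sample.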

## Linked entities

- **Diseases:** Diabetic Retinopathy (MONDO:0005266)

## Full-text entities

- **Diseases:** DR (MESH:D003930), diabetes (MESH:D003920), retinal abnormalities (MESH:D012164), vision impairment (MESH:D014786)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12829329/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12829329/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/PMC12829329/full.md

---
Source: https://tomesphere.com/paper/PMC12829329