# Longitudinal Forecasting of Retinal Structure and Function Using a Multimodal StyleGAN-Based Architecture

**Authors:** Arunodhayan Sampathkumar, Danny Kowerko

PMC · DOI: 10.3390/bioengineering13020149 · Bioengineering · 2026-01-28

## TL;DR

This paper presents a multimodal, StyleGAN-inspired GAN that forecasts future retinal OCT images and visual acuity (logMAR) to support patient monitoring and personalized treatment planning.

## Contribution

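A multimodal GAN inspired by StyleGAN, with super-resolution modules, a multi-scale patch discriminator, and temporal attention, that forecasts retinal OCT images, trained jointly with a hybrid deep-shallow LSTM that predicts logMAR visual acuity.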

## Key findings

- The model achieved an SSIM of 0.9264, FID of 11.9, and PSNR of 38.1 dB for OCT forecasting.
- The logMAR module had an MAE of 0.052, and the biomarker classifier reached a macro-F1 score of 0.81.
- Based on forecast logMAR change (threshold Δ = 0.05), patients were categorized into Winner, Stabilizer, and Loser outcome groups with an overall F1 score of 0.84.
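
For readers less familiar with the metrics quoted above, the following is a minimal sketch of how MAE, macro-F1, and PSNR are conventionally computed. It is not the authors' evaluation code. The function names, the single-label treatment of macro-F1 (the paper's 16-biomarker classifier may well be multi-label), and the default 8-bit intensity range `max_value=255.0` are all assumptions made for illustration. SSIM and FID are omitted because they need considerably more machinery.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference, e.g. between true and forecast logMAR."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (single-label case, an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def psnr(reference, forecast, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images given as nested lists."""
    flat_ref = [v for row in reference for v in row]
    flat_fc = [v for row in forecast for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_fc)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR and macro-F1 are better, while lower MAE is better, which is the direction in which the reported 38.1 dB, 0.81, and 0.052 should be read.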

## Abstract

Generative Adversarial Networks (GANs) have emerged as powerful tools for medical image synthesis and clinical outcome prediction. In ophthalmology, accurate forecasting of Optical Coherence Tomography (OCT) images and best-corrected visual acuity (BCVA) values can significantly enhance patient monitoring and personalized treatment planning. We introduce a multimodal GAN inspired by the StyleGAN architecture, featuring super-resolution modules, a multi-scale patch discriminator, and temporal attention mechanisms. To predict logMAR values, a hybrid deep–shallow LSTM model was jointly trained alongside the image pipeline. Synthesized scans were processed through an EfficientNet-based classifier to predict 16 retinal biomarkers. To ensure subject independence, we employed a 3-fold patient-level cross-validation strategy. The proposed multimodal GAN achieved an SSIM of 0.9264, an FID of 11.9, and a PSNR of 38.1 dB for OCT forecasting. The logMAR module delivered an MAE of 0.052, while the biomarker classifier attained a macro-F1 score of 0.81. Based on logMAR change forecasting, patients were further categorized into Winner, Stabilizer, and Loser outcome groups using a threshold of Δ=0.05, achieving an overall F1 score of 0.84. Our approach effectively forecasts retinal morphology and functional outcomes, providing valuable predictive insights for proactive clinical decision-making in retinal health management.
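
Two procedural details in the abstract, the Δ = 0.05 outcome grouping and the patient-level cross-validation, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. It assumes Δ is forecast logMAR minus baseline logMAR, that a decrease counts as improvement (lower logMAR means better acuity), and that the threshold is inclusive at both ends. The round-robin fold assignment is likewise a hypothetical choice, since the abstract does not say how patients were allocated to the 3 folds.

```python
def categorize_outcome(baseline_logmar, forecast_logmar, threshold=0.05):
    """Map a forecast logMAR change to Winner / Stabilizer / Loser.

    Assumption: delta = forecast - baseline, and lower logMAR is better,
    so a sufficiently negative delta is an improvement.
    """
    delta = forecast_logmar - baseline_logmar
    if delta <= -threshold:
        return "Winner"
    if delta >= threshold:
        return "Loser"
    return "Stabilizer"


def patient_level_folds(patient_ids, n_folds=3):
    """Split sample indices so that no patient appears in both train and test.

    patient_ids holds one entry per sample (e.g. per OCT visit).
    Assumption: patients are assigned to folds round-robin in sorted order.
    """
    unique_patients = sorted(set(patient_ids))
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique_patients)}
    folds = []
    for k in range(n_folds):
        test = [i for i, pid in enumerate(patient_ids) if fold_of[pid] == k]
        train = [i for i, pid in enumerate(patient_ids) if fold_of[pid] != k]
        folds.append((train, test))
    return folds


# Usage with made-up values (not data from the paper).
groups = [categorize_outcome(b, f) for b, f in [(0.40, 0.30), (0.30, 0.32), (0.30, 0.50)]]
visits = ["p1", "p1", "p2", "p3", "p3", "p4"]
splits = patient_level_folds(visits)
```

Splitting by patient rather than by scan matters here because repeated visits from the same eye are highly correlated, so a scan-level split would leak information from training into testing.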

## Full-text entities

- **Genes:** CST12P (cystatin 12, pseudogene) [NCBI Gene 106478911] {aka Cst, Ctes4, E2}, VEGFA (vascular endothelial growth factor A) [NCBI Gene 7422] {aka L-VEGF, MVCD1, VEGF, VPF}
- **Diseases:** DME (MESH:D008269), diabetic retinopathy (MESH:D003930), retinal thickening (MESH:D012173), glaucoma (MESH:D005901), retinal disease (MESH:D012164), depressions (MESH:D003866), AMD (MESH:D008268), atrophy (MESH:D001284), edema (MESH:D004487), RPE (MESH:C536309), visual impairment (MESH:D014786), OLIVES (MESH:C535922), injury to (MESH:D014947), IR (MESH:D006949), Serous PED (MESH:D012163), VD (MESH:C536356), confusions (MESH:D003221), hemorrhage (MESH:D006470), lesion (MESH:D009059)
- **Chemicals:** DENet (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12938035/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12938035/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/PMC12938035/full.md

---
Source: https://tomesphere.com/paper/PMC12938035