Spatial-aware Transformer-GRU Framework for Enhanced Glaucoma Diagnosis from 3D OCT Imaging
Mona Ashtari-Majlan, David Masip

TL;DR
This paper introduces a deep learning framework that combines a Vision Transformer with a bidirectional GRU to analyze 3D OCT images for glaucoma diagnosis, and reports better performance than existing methods.
Contribution
It presents a new spatial-aware Transformer-GRU framework that effectively captures local and global features in 3D OCT data for glaucoma detection.
Findings
Achieved an F1-score of 93.01% on a large dataset.
Outperformed state-of-the-art methods in glaucoma detection.
Demonstrated the framework's potential for clinical decision support.
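For readers unfamiliar with the reported metrics, the following is a minimal sketch of how an F1-score and a Matthews Correlation Coefficient (the second metric named in the abstract) can be computed from binary predictions. The function name `f1_and_mcc` and the toy labels are hypothetical and are not taken from the paper; the formulas are the standard confusion-matrix definitions.

```python
import math


def f1_and_mcc(y_true, y_pred):
    """Return (F1, MCC) for binary labels, where 1 marks the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all.
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0 if undefined.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return f1, mcc


# Toy labels for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
f1, mcc = f1_and_mcc(y_true, y_pred)
print(f1, mcc)  # 0.75 0.5
```

MCC uses all four cells of the confusion matrix, so it is often reported alongside F1 when the classes are imbalanced.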
Abstract
Glaucoma, a leading cause of irreversible blindness, necessitates early detection for accurate and timely intervention to prevent irreversible vision loss. In this study, we present a novel deep learning framework that leverages the diagnostic value of 3D Optical Coherence Tomography (OCT) imaging for automated glaucoma detection. In this framework, we integrate a pre-trained Vision Transformer on retinal data for rich slice-wise feature extraction and a bidirectional Gated Recurrent Unit for capturing inter-slice spatial dependencies. This dual-component approach enables comprehensive analysis of local nuances and global structural integrity, crucial for accurate glaucoma diagnosis. Experimental results on a large dataset demonstrate the superior performance of the proposed method over state-of-the-art ones, achieving an F1-score of 93.01%, Matthews Correlation Coefficient (MCC) of…
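The two-stage design described in the abstract, slice-wise feature extraction followed by a bidirectional GRU over the slice sequence, can be sketched roughly as below in PyTorch. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class name `SliceSequenceClassifier`, the toy linear encoder standing in for the pre-trained Vision Transformer, the hidden size, the mean pooling over slices, and the input dimensions are all hypothetical.

```python
import torch
import torch.nn as nn


class SliceSequenceClassifier(nn.Module):
    """Hypothetical sketch: a per-slice encoder followed by a bidirectional GRU."""

    def __init__(self, slice_encoder: nn.Module, feat_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.slice_encoder = slice_encoder  # assumed to map (N, C, H, W) -> (N, feat_dim)
        self.gru = nn.GRU(feat_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, 1)  # one logit: glaucoma vs. normal

    def forward(self, volume: torch.Tensor) -> torch.Tensor:
        # volume: (batch, slices, channels, height, width)
        b, s, c, h, w = volume.shape
        # Encode every slice independently (local, slice-wise features).
        feats = self.slice_encoder(volume.reshape(b * s, c, h, w))
        feats = feats.reshape(b, s, -1)
        # Model dependencies between neighbouring slices in both directions.
        seq_out, _ = self.gru(feats)  # (batch, slices, 2 * hidden_dim)
        # Assumption: average over slices; the paper's aggregation may differ.
        pooled = seq_out.mean(dim=1)
        return self.head(pooled).squeeze(-1)  # (batch,) logits


# Stand-in encoder for illustration; a real setup would use a pre-trained ViT here.
toy_encoder = nn.Sequential(nn.Flatten(), nn.Linear(1 * 32 * 32, 64))
model = SliceSequenceClassifier(toy_encoder, feat_dim=64)
logits = model(torch.randn(2, 16, 1, 32, 32))  # 2 volumes, 16 slices of 32x32
print(logits.shape)  # torch.Size([2])
```

Keeping the slice encoder as an injected module makes it straightforward to swap the toy encoder for a pre-trained backbone without touching the recurrent part.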
Taxonomy
Topics: Retinal Imaging and Analysis · Optical Coherence Tomography Applications · Glaucoma and retinal disorders
Methods: Attention Is All You Need · Dropout · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Softmax · Dense Connections · Label Smoothing · Adam · Residual Connection · Byte Pair Encoding
