OCTCube-M: A 3D multimodal optical coherence tomography foundation model for retinal and systemic diseases with cross-cohort and cross-device validation
Zixuan Liu, Hanwen Xu, Addie Woicik, Linda G. Shapiro, Marian Blazes, Yue Wu, Verena Steffen, Catherine Cukras, Cecilia S. Lee, Miao Zhang, Aaron Y. Lee, Sheng Wang

TL;DR
OCTCube-M is a comprehensive 3D multimodal foundation model that integrates OCT with other retinal imaging modalities, demonstrating superior performance in diagnosing retinal and systemic diseases across diverse cohorts and devices.
Contribution
This work introduces OCTCube-M, a novel 3D multi-modal foundation model utilizing contrastive learning to unify OCT with other retinal imaging modalities, enhancing diagnostic accuracy and generalizability.
Findings
Achieves the best performance on predicting 8 retinal diseases, generalizing across cohorts, devices, and modalities.
Accurately predicts systemic diseases like diabetes and hypertension.
Enables effective cross-modal image retrieval and analysis.
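Cross-modal retrieval of the kind listed above is commonly done by embedding each modality into a shared space and ranking candidates by cosine similarity. The following is a minimal sketch under that assumption, not the authors' pipeline: the function names `cosine` and `retrieve` and the toy 2-D embeddings are hypothetical, and a real system would use embeddings produced by the trained encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def retrieve(query, gallery, k=1):
    """Return indices of the k gallery embeddings most similar to the query."""
    ranked = sorted(
        range(len(gallery)),
        key=lambda i: cosine(query, gallery[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy data: one OCT embedding queried against three
# en face image embeddings that live in the same shared space.
oct_query = [0.9, 0.1]
en_face_gallery = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
top_two = retrieve(oct_query, en_face_gallery, k=2)
```

Here `top_two` holds the gallery indices ordered from most to least similar to the query.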
Abstract
We present OCTCube-M, a 3D OCT-based multi-modal foundation model for jointly analyzing OCT and en face images. OCTCube-M first develops OCTCube, a 3D foundation model pre-trained on 26,685 3D OCT volumes encompassing 1.62 million 2D OCT images. It then uses a novel multi-modal contrastive learning framework, COEP, to integrate other retinal imaging modalities, such as fundus autofluorescence and infrared retinal imaging, into OCTCube, efficiently extending it into multi-modal foundation models. OCTCube achieves the best performance on predicting 8 retinal diseases, demonstrating strong generalizability on cross-cohort, cross-device and cross-modality prediction. OCTCube can also predict cross-organ nodule malignancy (CT) and low cardiac ejection fraction as well as systemic diseases, such as diabetes and hypertension, revealing its wide applicability beyond retinal diseases. We further…
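The abstract does not spell out how COEP works, so the sketch below shows only the generic idea behind multi-modal contrastive learning: a symmetric InfoNCE (CLIP-style) loss that pulls paired embeddings from two modalities together and pushes unpaired ones apart. It is an assumption-laden illustration, not the COEP objective; the function names, the temperature value of 0.07, and the toy embeddings are all hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def symmetric_info_nce(oct_emb, other_emb, temperature=0.07):
    """Generic symmetric contrastive loss over a batch of paired embeddings.

    Row i of oct_emb and row i of other_emb are assumed to come from the
    same eye; every other pairing in the batch is treated as a negative.
    """
    a = [l2_normalize(v) for v in oct_emb]
    b = [l2_normalize(v) for v in other_emb]
    n = len(a)
    # Temperature-scaled cosine similarity between every OCT/other pair.
    logits = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_on_diagonal(rows):
        # Mean cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_sum_exp = m + math.log(sum(math.exp(z - m) for z in row))
            total += log_sum_exp - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    # Average the OCT-to-other and other-to-OCT directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(transposed))

# Hypothetical toy batch: correctly paired vs. swapped embeddings.
oct_batch = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = symmetric_info_nce(oct_batch, [[1.0, 0.0], [0.0, 1.0]])
loss_swapped = symmetric_info_nce(oct_batch, [[0.0, 1.0], [1.0, 0.0]])
```

With correctly paired embeddings the loss is close to zero, while swapping the pairs makes it large, which is the signal a contrastive objective of this family uses to align modalities.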
Taxonomy
Topics: Optical Coherence Tomography Applications · Cell Image Analysis Techniques
Methods: Contrastive Learning
