UrFound: Towards Universal Retinal Foundation Models via Knowledge-Guided Masked Modeling
Kai Yu, Yang Zhou, Yang Bai, Zhi Da Soh, Xinxing Xu, Rick Siow Mong Goh, Ching-Yu Cheng, Yong Liu

TL;DR
UrFound introduces a universal retinal foundation model that leverages multimodal images and domain knowledge through a knowledge-guided masked modeling strategy, achieving superior generalization and data efficiency across ophthalmic tasks.
Contribution
The paper presents UrFound, a novel retinal foundation model that integrates multimodal retinal images and expert annotations using a knowledge-guided masked modeling approach for improved generalization.
Findings
Outperforms state-of-the-art models on 8 retinal datasets
Demonstrates strong generalization and data efficiency
Trained on approximately 180k images, surpassing models trained on larger datasets
Abstract
Retinal foundation models aim to learn generalizable representations from diverse retinal images, facilitating label-efficient model adaptation across various ophthalmic tasks. Despite their success, current retinal foundation models are generally restricted to a single imaging modality, such as Color Fundus Photography (CFP) or Optical Coherence Tomography (OCT), limiting their versatility. Moreover, these models may struggle to fully leverage expert annotations and overlook the valuable domain knowledge essential for domain-specific representation learning. To overcome these limitations, we introduce UrFound, a retinal foundation model designed to learn universal representations from both multimodal retinal images and domain knowledge. UrFound is equipped with a modality-agnostic image encoder and accepts either CFP or OCT images as inputs. To integrate domain knowledge into…
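As a rough illustration of the masked modeling idea the abstract refers to, the sketch below hides a random subset of image patches so that a model would have to reconstruct them from the visible ones. It covers only generic random patch masking. The knowledge-guided component is not reproduced, because the abstract is truncated before describing it. The function names, the 16x16 patch size, and the 0.75 mask ratio are illustrative assumptions, not values taken from the paper.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Return a list of booleans, True where a patch is hidden.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    # Sample the hidden patch indices without replacement.
    masked_ids = set(rng.sample(range(num_patches), num_masked))
    return [i in masked_ids for i in range(num_patches)]


def split_patches(patches, mask):
    """Split patches into (visible, hidden) according to the mask."""
    visible = [p for p, m in zip(patches, mask) if not m]
    hidden = [p for p, m in zip(patches, mask) if m]
    return visible, hidden


# Assumed setup: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196.
# Integers stand in for patch contents here.
patches = list(range(196))
mask = random_patch_mask(len(patches), mask_ratio=0.75, seed=0)
visible, hidden = split_patches(patches, mask)
```

In this setup only the visible patches would be passed to an encoder, and the hidden ones serve as reconstruction targets. Because the masking operates on patch indices rather than on image content, the same routine would apply to either a CFP or an OCT input.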
Taxonomy
Topics: Retinal Imaging and Analysis · Scientific Computing and Data Management · Robotics and Automated Systems
