When Do Domain-Specific Foundation Models Justify Their Cost? A Systematic Evaluation Across Retinal Imaging Tasks
David Isztl, Tahm Spitznagel, Gabor Mark Somfai, Rui Santos

TL;DR
This study systematically evaluates the necessity and cost-effectiveness of large domain-specific foundation models versus compact general-purpose models for retinal imaging tasks, finding that smaller models often suffice.
Contribution
The paper provides a benchmark across four retinal imaging classification tasks, showing that compact general-purpose models generally match or outperform large domain-specific models in accuracy while being more efficient.
Findings
Pretraining universally improves performance, especially on difficult tasks.
Compact models like SwinV2-tiny often outperform larger domain-specific models.
Specialized models are justified only for fine-grained, imbalanced classification tasks.
Abstract
Large vision foundation models have been widely adopted for retinal disease classification without systematic evidence justifying their parameter requirements. In the present work, we address two critical questions: first, are large domain-specific foundation models essential, or do compact general-purpose architectures suffice? Second, does specialized retinal pretraining justify its computational cost? To answer these questions, we benchmark initialization strategies across four retinal imaging classification tasks spanning Optical Coherence Tomography (OCT) and Color Fundus Photography (CFP) modalities: 8-class OCT classification, 3-class diabetic macular edema (DME), 5-class diabetic retinopathy (DR), and 3-class glaucoma (GL) detection. We evaluate 12-13 model configurations per task, including vision transformers (22.8M-86.6M parameters), Swin Transformers (27.6M-28.3M), ConvNeXt (28.6M), and…
Taxonomy
Topics: Retinal Imaging and Analysis · Retinal Diseases and Treatments · Domain Adaptation and Few-Shot Learning
