When Do Domain-Specific Foundation Models Justify Their Cost? A Systematic Evaluation Across Retinal Imaging Tasks

David Isztl; Tahm Spitznagel; Gabor Mark Somfai; Rui Santos

arXiv:2511.22001 · eess.IV · December 1, 2025

TL;DR

This study systematically evaluates the necessity and cost-effectiveness of large domain-specific foundation models versus compact general-purpose models for retinal imaging tasks, finding that smaller models often suffice.

Contribution

The paper benchmarks initialization strategies across four retinal imaging classification tasks and shows that compact general-purpose models generally match or outperform large domain-specific models in both accuracy and efficiency.

Findings

1. Pretraining universally improves performance, especially on difficult tasks.
2. Compact models like SwinV2-tiny often outperform larger domain-specific models.
3. Specialized models are only justified for fine-grained, imbalanced classification tasks.

Abstract

Large vision foundation models have been widely adopted for retinal disease classification without systematic evidence justifying their parameter requirements. In this work, we address two critical questions. First, are large domain-specific foundation models essential, or do compact general-purpose architectures suffice? Second, does specialized retinal pretraining justify its computational cost? To answer these questions, we benchmark initialization strategies across four retinal imaging classification tasks spanning Optical Coherence Tomography (OCT) and Color Fundus Photography (CFP) modalities: 8-class OCT classification, 3-class diabetic macular edema (DME), 5-class diabetic retinopathy (DR), and 3-class glaucoma (GL) detection. We evaluate 12-13 model configurations per task, including vision transformers (22.8M-86.6M parameters), Swin Transformers (27.6M-28.3M), ConvNeXt (28.6M), and…
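
The benchmark layout described in the abstract can be pictured as a grid of tasks, architecture families, and initialization strategies. The following is a minimal sketch, not the authors' code: the task names, class counts, and parameter ranges are taken from the abstract, while the class names (`Task`, `Family`), the helper functions, and the two initialization labels (`"random"`, `"pretrained"`) are assumptions made for illustration. Because the abstract is truncated, the family list is likely incomplete, and this grid does not reproduce the 12-13 configurations per task that the paper reports.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Task:
    name: str
    num_classes: int


@dataclass(frozen=True)
class Family:
    name: str
    min_params_m: float  # smallest listed variant, millions of parameters
    max_params_m: float  # largest listed variant, millions of parameters


# Tasks and class counts as listed in the abstract.
TASKS = [
    Task("OCT", 8),
    Task("DME", 3),
    Task("DR", 5),
    Task("GL", 3),
]

# Parameter ranges as listed in the abstract (the list is cut off there,
# so further families may exist in the paper).
FAMILIES = [
    Family("ViT", 22.8, 86.6),
    Family("Swin", 27.6, 28.3),
    Family("ConvNeXt", 28.6, 28.6),
]

# Hypothetical labels: the abstract mentions "initialization strategies"
# but does not enumerate them.
INIT_STRATEGIES = ["random", "pretrained"]


def build_grid():
    """Return every (task, family, init) combination to be trained."""
    return list(product(TASKS, FAMILIES, INIT_STRATEGIES))


def size_ratio(family, reference_params_m):
    """How many times larger the family's biggest variant is than a reference size."""
    return family.max_params_m / reference_params_m


grid = build_grid()
vit_ratio = size_ratio(FAMILIES[0], FAMILIES[0].min_params_m)
print(f"{len(grid)} runs in this illustrative grid")
print(f"largest ViT is about {vit_ratio:.1f}x the smallest ViT")
```

Under these assumptions, the size ratio makes the cost question concrete: the largest listed vision transformer carries several times the parameters of the smallest, so any accuracy gain has to be weighed against that multiple.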

Taxonomy

Topics: Retinal Imaging and Analysis · Retinal Diseases and Treatments · Domain Adaptation and Few-Shot Learning