Block Expanded DINORET: Adapting Natural Domain Foundation Models for Retinal Imaging Without Catastrophic Forgetting

Jay Zoellin; Colin Merk; Mischa Buob; Amr Saad; Samuel Giesser; Tahm Spitznagel; Ferhat Turgut; Rui Santos; Yukun Zhou; Sigfried Wagner; Pearse A. Keane; Yih Chung Tham; Delia Cabrera DeBuc; Matthias D. Becker; Gabor M. Somfai

arXiv:2409.17332·cs.CV·September 27, 2024

Open Access

TL;DR

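This paper introduces DINORET and BE DINORET, two retinal imaging foundation models adapted from the DINOv2 vision transformer with self-supervised learning. BE DINORET additionally uses block expansion to limit catastrophic forgetting during domain adaptation. The models are reported to be more data-efficient than RETFound and to reach competitive accuracy on retinal tasks.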

Contribution

The study proposes a novel block expansion method for domain adaptation and demonstrates effective fine-tuning of foundation models for retinal imaging without catastrophic forgetting.
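As a rough illustration of the idea behind block expansion, the toy sketch below inserts identity-initialised copies into a stack of residual blocks and freezes the originals. It is not the authors' implementation: the `Block` class, the `expand_blocks` helper, the `every` parameter, and the element-wise "weights" are all hypothetical stand-ins, and a real vision transformer block would typically zero only its output projections rather than the whole block.

```python
import copy
from dataclasses import dataclass


@dataclass
class Block:
    """Toy residual block (hypothetical): y = x + weight * x, element-wise."""
    weight: list            # stands in for the attention + MLP parameters
    trainable: bool = True

    def forward(self, x):
        return [xi + wi * xi for xi, wi in zip(x, self.weight)]


def expand_blocks(blocks, every):
    """Insert one identity-initialised block after each group of `every` blocks.

    Original blocks are copied and marked frozen; only the inserted
    blocks are marked trainable.
    """
    expanded = []
    for i, block in enumerate(blocks, start=1):
        frozen = copy.deepcopy(block)
        frozen.trainable = False              # pretrained weights stay fixed
        expanded.append(frozen)
        if i % every == 0:
            new = copy.deepcopy(block)
            # Zeroing the residual branch makes the new block an identity map,
            # so the expanded stack initially computes the same function.
            new.weight = [0.0] * len(new.weight)
            new.trainable = True
            expanded.append(new)
    return expanded


def run(blocks, x):
    """Apply the blocks in order."""
    for block in blocks:
        x = block.forward(x)
    return x


# Assumed toy values, chosen only for illustration.
base = [Block([0.5, -0.25]), Block([0.1, 0.2]), Block([1.0, 0.0]), Block([-0.5, 0.3])]
expanded = expand_blocks(base, every=2)
x = [1.0, 2.0]
```

Because the inserted blocks start as identity maps, `run(expanded, x)` matches `run(base, x)` before any training, and because the original blocks are frozen, later updates can only change the inserted blocks. That is the intuition for why this kind of expansion can adapt to a new domain while leaving the pretrained behaviour recoverable.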

Findings

01. Block expansion mitigates catastrophic forgetting.

02. DINORET models outperform RETFound in data efficiency.

03. Models achieve competitive accuracy on retinal tasks.

Abstract

Integrating deep learning into medical imaging is poised to greatly advance diagnostic methods but it faces challenges with generalizability. Foundation models, based on self-supervised learning, address these issues and improve data efficiency. Natural domain foundation models show promise for medical imaging, but systematic research evaluating domain adaptation, especially using self-supervised learning and parameter-efficient fine-tuning, remains underexplored. Additionally, little research addresses the issue of catastrophic forgetting during fine-tuning of foundation models. We adapted the DINOv2 vision transformer for retinal imaging classification tasks using self-supervised learning and generated two novel foundation models termed DINORET and BE DINORET. Publicly available color fundus photographs were employed for model development and subsequent fine-tuning for diabetic…

Taxonomy

Topics: Retinal Imaging and Analysis · Cell Image Analysis Techniques · Medical Image Segmentation Techniques

Methods: Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Multi-Head Attention · Layer Normalization · Residual Connection · Vision Transformer