Learning to Adapt Foundation Model DINOv2 for Capsule Endoscopy Diagnosis

Bowen Zhang; Ying Chen; Long Bai; Yan Zhao; Yuxiang Sun; Yixuan Yuan; Jianhua Zhang; Hongliang Ren

arXiv:2406.10508·cs.CV·July 2, 2024

Open Access

TL;DR

This paper presents a low-rank adaptation method to fine-tune foundation models like DINOv2 for capsule endoscopy diagnosis, achieving high accuracy on public datasets with minimal training adjustments.

Contribution

Introduces a LoRA-based adaptation approach for foundation models, enabling effective and efficient customization for medical imaging tasks like capsule endoscopy diagnosis.

Findings

01

Achieved 97.75% accuracy on the Kvasir-Capsule dataset.

02

Achieved 98.81% accuracy on the Kvasirv2 dataset.

03

Demonstrated the effectiveness of LoRA for domain-specific model adaptation.

Abstract

Foundation models have become prominent in computer vision, achieving notable success in various tasks. However, their effectiveness largely depends on pre-training with extensive datasets. Training such models from scratch on small datasets of capsule endoscopy images is challenging, so pre-training on broad, general vision datasets is crucial for successfully fine-tuning a model for specific tasks. In this work, we introduce a simplified approach that adapts foundation models with a low-rank adaptation (LoRA) technique for easier customization. Our method, inspired by the DINOv2 foundation model, applies low-rank adaptation learning to tailor foundation models for capsule endoscopy diagnosis effectively. Unlike traditional fine-tuning methods, our strategy includes LoRA layers designed to absorb specific surgical domain knowledge. During the training process, we keep…
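The low-rank adaptation idea mentioned in the abstract can be illustrated with a small, dependency-free sketch. It follows the general LoRA formulation, in which a pretrained weight W stays fixed and only a low-rank update B·A is trained, so the layer computes W·x + (alpha / r)·B·A·x. The class name LoRALinear, the helper matvec, and the values of rank and alpha are hypothetical choices for illustration. The abstract above is truncated, so this is not a reproduction of the paper's architecture or of where its LoRA layers sit inside DINOv2.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Illustrative linear layer with a low-rank update (hypothetical name).

    weight is treated as frozen pretrained weights of shape d_out x d_in.
    Only A (rank x d_in) and B (d_out x rank) would be trained.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight            # frozen: never updated in this sketch
        self.scale = alpha / rank       # assumed scaling, as in standard LoRA
        # A starts with small random values, B starts at zero, so the
        # adapted layer initially matches the pretrained layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)                # frozen path: W x
        update = matvec(self.B, matvec(self.A, x))   # low-rank path: B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Toy example: an 8 x 8 identity "pretrained" weight adapted with rank 2.
d = 8
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
layer = LoRALinear(W, rank=2, alpha=2.0)
x = [float(i) for i in range(d)]
y = layer.forward(x)
full_params = d * d
lora_params = layer.trainable_params()
```

In this toy setting the low-rank factors hold 2·8 + 8·2 = 32 trainable values against 64 in the full weight matrix. The gap widens quickly for the much larger projection matrices of a vision transformer, which is what makes this style of adaptation attractive for small medical datasets.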


Taxonomy

Topics: Gastrointestinal Bleeding Diagnosis and Treatment

Methods: Focus