A Large-scale Medical Visual Task Adaptation Benchmark

Shentong Mo; Xufang Luo; Yansen Wang; Dongsheng Li

arXiv:2404.12876 · cs.CV · April 22, 2024 · 1 citation

TL;DR

This paper introduces Med-VTAB, a large-scale benchmark for medical visual task adaptation, and proposes GMoE-Adapter, a novel method that improves adaptation performance across diverse medical imaging modalities.

Contribution

It provides the first large-scale benchmark for medical visual task adaptation and introduces GMoE-Adapter, a new approach that combines medical and general pre-trained weights for better adaptation performance.

Findings

01

Single pre-trained models are insufficient for medical task adaptation.

02

GMoE-Adapter achieves state-of-the-art results in medical visual adaptation.

03

The scaling law of medical prompt tuning with respect to tunable parameters and the impact of patient ID out-of-distribution shifts are analyzed.

Abstract

Visual task adaptation has been demonstrated to be effective in adapting pre-trained Vision Transformers (ViTs) to general downstream visual tasks using specialized learnable layers or tokens. However, there is not yet a large-scale benchmark that fully explores the effect of visual task adaptation in the realistic and important medical domain, particularly across diverse medical visual modalities such as color images, X-ray, and CT. To close this gap, we present Med-VTAB, a large-scale Medical Visual Task Adaptation Benchmark consisting of 1.68 million medical images for diverse organs, modalities, and adaptation approaches. Based on Med-VTAB, we explore the scaling law of medical prompt tuning with respect to tunable parameters and the generalizability of medical visual adaptation using non-medical and medical pre-trained weights. In addition, we study the impact of patient ID out-of-distribution on…
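To make "tunable parameters" in prompt tuning concrete, the sketch below counts the learnable parameters when prompt tokens are added to a frozen ViT. It assumes a VPT-style setup (shallow prompts at the input only, or deep prompts at every layer) and standard ViT-B/16 dimensions; the names `ViTConfig`, `prompt_params`, `head_params`, and `tunable_params` are hypothetical and are not taken from the paper, which may count parameters differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTConfig:
    """Backbone shape; defaults are the standard ViT-B/16 values (assumed here)."""
    hidden_dim: int = 768
    num_layers: int = 12


def prompt_params(cfg: ViTConfig, num_prompts: int, deep: bool) -> int:
    """Count learnable prompt-token parameters.

    Shallow: prompts are prepended only at the input of the first layer.
    Deep: a separate set of prompts is learned for every layer.
    """
    per_layer = num_prompts * cfg.hidden_dim
    return per_layer * cfg.num_layers if deep else per_layer


def head_params(cfg: ViTConfig, num_classes: int) -> int:
    """Linear classification head: weight matrix plus bias."""
    return cfg.hidden_dim * num_classes + num_classes


def tunable_params(cfg: ViTConfig, num_prompts: int, num_classes: int,
                   deep: bool = True) -> int:
    """Total trainable parameters when the backbone itself stays frozen."""
    return prompt_params(cfg, num_prompts, deep) + head_params(cfg, num_classes)


if __name__ == "__main__":
    cfg = ViTConfig()
    # Sweep the prompt length to see how the tunable budget grows.
    for p in (1, 10, 50, 100):
        print(p, tunable_params(cfg, p, num_classes=2))
```

Under these assumptions the tunable budget grows linearly with the number of prompt tokens, which is the axis a scaling study over tunable parameters would vary.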

Taxonomy

Topics: Virtual Reality Applications and Impacts