Tuning Vision Foundation Model via Test-Time Prompt-Guided Training for VFSS Segmentations

Chengxi Zeng; David Smithard; Alberto M Gambaruto; Tilo Burghardt

arXiv:2501.18474·cs.CV·January 31, 2025

Open Access

TL;DR

This paper introduces a test-time training method using point prompts to improve foundation model segmentation performance on medical images without full annotations, demonstrated on a new VFSS dataset.

Contribution

It proposes a novel semi-self-supervised test-time training approach guided by point prompts, reducing annotation costs for medical image segmentation.

Findings

01

Achieves an average Dice coefficient of 0.868 on the VFSS-5k dataset.

02

Effectively improves segmentation performance without full annotations.

03

Demonstrates applicability in medical imaging tasks.
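
For readers unfamiliar with the metric in the first finding, the Dice coefficient between a predicted mask and a reference mask is twice the size of their overlap divided by the sum of their sizes. The snippet below is a minimal sketch of that standard definition for binary masks; the function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made here for illustration, not details taken from the paper.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 2 predicted pixels, 1 reference pixel, 1 pixel in common.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
score = dice_coefficient(pred_mask, true_mask)  # 2 * 1 / (2 + 1)
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no pixels.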

Abstract

Vision foundation models have demonstrated exceptional generalization capabilities in segmentation tasks for both generic and specialized images. However, a performance gap persists between foundation models and task-specific, specialized models. Fine-tuning foundation models on downstream datasets is often necessary to bridge this gap. Unfortunately, obtaining fully annotated ground truth for downstream datasets is both challenging and costly. To address this limitation, we propose a novel test-time training paradigm that enhances the performance of foundation models on downstream datasets without requiring full annotations. Specifically, our method employs simple point prompts to guide a test-time semi-self-supervised training task. The model learns by resolving the ambiguity of the point prompt through various augmentations. This approach directly tackles challenges in the medical…
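
The abstract does not spell out the training objective, so the following is only an illustrative sketch of one way a point prompt and augmentations could yield a label-free training signal: predict a mask for an image and its point, predict again on an augmented copy with the point moved accordingly, map the second prediction back, and penalise disagreement. Everything here is hypothetical: `toy_segmenter` is a stand-in for a promptable model, horizontal flipping is the only augmentation, and the mean-squared consistency loss is an assumption, not the authors' formulation.

```python
import numpy as np


def toy_segmenter(image, point, sharpness=10.0):
    """Hypothetical promptable model: soft mask of pixels whose intensity
    is close to the intensity at the prompted (row, col) point."""
    row, col = point
    return np.exp(-sharpness * np.abs(image - image[row, col]))


def hflip(array):
    """Horizontal flip (reverse the column axis)."""
    return array[:, ::-1]


def hflip_point(point, width):
    """Move a (row, col) point prompt to its location in the flipped image."""
    row, col = point
    return (row, width - 1 - col)


def consistency_loss(image, point, model=toy_segmenter):
    """Mean squared disagreement between the prediction on the original view
    and the prediction on the flipped view mapped back to the original frame."""
    pred = model(image, point)
    pred_aug = model(hflip(image), hflip_point(point, image.shape[1]))
    return float(np.mean((pred - hflip(pred_aug)) ** 2))


def biased_segmenter(image, point):
    """Hypothetical model with a left-to-right bias, so it is not flip-consistent."""
    return toy_segmenter(image, point) * np.linspace(0.0, 1.0, image.shape[1])


image = np.random.default_rng(0).random((8, 8))
loss_consistent = consistency_loss(image, (2, 3))
loss_biased = consistency_loss(image, (2, 3), model=biased_segmenter)
```

In this toy setup the flip-equivariant model incurs zero loss while the biased model incurs a positive one; in a real test-time training loop, a loss of this kind would be minimised by gradient updates to the model's parameters, which this sketch omits.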

Taxonomy

Topics: Infrared Target Detection Methodologies