GuiDINO: Rethinking Vision Foundation Model in Medical Image Segmentation
Zhuonan Liang, Wei Guo, Jie Gan, Yaxuan Song, Runnan Chen, Hang Chang, Weidong Cai

TL;DR
GuiDINO is a framework that uses foundation vision models as visual guidance generators for medical image segmentation, improving accuracy and boundary robustness without extensive fine-tuning.
Contribution
The paper presents GuiDINO, a new method that repositions foundation models as guidance generators for segmentation, enabling efficient adaptation and improved performance in medical imaging tasks.
Findings
Consistently improves segmentation quality across diverse datasets
Enhances boundary robustness in medical image segmentation
Supports parameter-efficient adaptation with LoRA
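For readers unfamiliar with LoRA, the following is a minimal NumPy sketch of the general technique: a frozen weight matrix plus a trainable low-rank update. It is not taken from the paper. The class name `LoRALinear`, the rank, the scaling `alpha / rank`, and the initialisation are illustrative assumptions, and the summary does not say which layers GuiDINO adapts.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (generic LoRA sketch).

    Hypothetical helper for illustration; not the paper's implementation.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                  # frozen, shape (out_dim, in_dim)
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))   # trainable down-projection
        self.B = np.zeros((out_dim, rank))                    # trainable, zero-initialised
        self.scale = alpha / rank                             # assumed scaling convention

    def __call__(self, x):
        # x: (batch, in_dim). Output = frozen path + scaled low-rank path.
        frozen = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T
        return frozen + self.scale * update

    def trainable_params(self):
        # Only A and B would be updated during adaptation.
        return self.A.size + self.B.size


# Usage: a 16x32 layer adapted with rank 4.
rng = np.random.default_rng(1)
W = rng.normal(size=(16, 32))
layer = LoRALinear(W, rank=4)
y = layer(rng.normal(size=(5, 32)))
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only `A` and `B` (far fewer values than `W`) would need training.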
Abstract
Foundation vision models are increasingly adopted in medical image analysis. Due to domain shift, these pretrained models are misaligned with the needs of medical image segmentation unless they are fully fine-tuned or lightly adapted. We introduce GuiDINO, a framework that repositions the native foundation model as a visual guidance generator for downstream segmentation. GuiDINO extracts visual feature representations from DINOv3 and converts them into a spatial guide mask via a lightweight TokenBook mechanism, which aggregates token-prototype similarities. This guide mask gates feature activations in multiple segmentation backbones, thereby injecting foundation-model priors while preserving the inductive biases and efficiency of dedicated medical architectures. Training relies on a guide supervision loss that aligns the guide mask with ground-truth regions, optionally augmented by a…
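The pipeline in the abstract (token-prototype similarities, a spatial guide mask, gating of backbone features, and guide supervision) can be illustrated with a small NumPy sketch. The abstract gives no formulas, so every concrete choice here is an assumption: cosine similarity, max aggregation over prototypes, a sigmoid with a temperature, nearest-neighbour upsampling, multiplicative gating, and binary cross-entropy as the guide loss. The function names `guide_mask`, `gate_features`, and `guide_loss` are hypothetical, and random arrays stand in for DINOv3 patch tokens.

```python
import numpy as np


def guide_mask(tokens, prototypes, grid_hw, temperature=0.1):
    """Turn patch tokens into a spatial mask (assumed formulation).

    tokens: (N, D) patch tokens; prototypes: (K, D); grid_hw: (H, W), H * W == N.
    Returns an (H, W) mask with values in (0, 1).
    """
    h, w = grid_hw
    assert tokens.shape[0] == h * w, "token count must match the patch grid"
    t = tokens / (np.linalg.norm(tokens, axis=1, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    sim = t @ p.T                    # (N, K) cosine similarities
    score = sim.max(axis=1)          # aggregate over prototypes (assumption: max)
    mask = 1.0 / (1.0 + np.exp(-score / temperature))
    return mask.reshape(h, w)


def gate_features(features, mask):
    """Multiply backbone features by the upsampled guide mask.

    features: (C, Hf, Wf); mask: (H, W), with Hf and Wf integer multiples of H and W.
    """
    _, hf, wf = features.shape
    h, w = mask.shape
    assert hf % h == 0 and wf % w == 0
    # Nearest-neighbour upsampling: repeat each mask cell over a block.
    up = np.kron(mask, np.ones((hf // h, wf // w)))
    return features * up[None, :, :]


def guide_loss(mask, target, eps=1e-7):
    """Binary cross-entropy between the guide mask and a binary region map
    at mask resolution (assumed form of the guide supervision objective)."""
    m = np.clip(mask, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(m) + (1.0 - target) * np.log(1.0 - m)))


# Usage with stand-in data: a 4x4 patch grid, 3 prototypes, 8x8 backbone features.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))
prototypes = rng.normal(size=(3, 8))
mask = guide_mask(tokens, prototypes, (4, 4))
gated = gate_features(rng.normal(size=(2, 8, 8)), mask)
```

In this sketch the segmentation backbone itself is untouched apart from the multiplication, which is one simple way to read "gates feature activations"; the paper may use a different gating form.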
Taxonomy
Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Domain Adaptation and Few-Shot Learning
