DINO-MVR: Multi-View Readout of Frozen DINOv3 for Annotation-Efficient Medical Segmentation
Wei Jiang, Feng Liu, Nan Ye, Hongfu Sun

TL;DR
This paper introduces DINO-MVR, a framework that trains lightweight MLP probes on frozen DINOv3 features and combines predictions from multiple views at inference, enabling accurate medical segmentation from very few annotations.
Contribution
The paper proposes a multi-view readout method that trains lightweight MLP probes on frozen DINOv3 features, so the backbone never needs fine-tuning for medical segmentation.
Findings
Achieves high Dice scores on multiple medical benchmarks.
Recovers 98.4% of full-data performance with only five annotated BraTS patients.
Demonstrates effective segmentation without backbone updates.
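For readers unfamiliar with the metric behind these findings, the following is a minimal sketch of the standard Dice coefficient for binary masks. The function name `dice_score` and the `eps` smoothing term are illustrative assumptions and are not taken from the paper, whose exact evaluation protocol is not described in this summary.

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    """Dice coefficient between two binary masks of the same shape.

    Illustrative helper only; `eps` is an assumed smoothing constant that
    makes two empty masks score 1.0 instead of dividing by zero.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    # Number of pixels that are foreground in both masks.
    intersection = np.logical_and(pred, target).sum()
    # Dice = 2|A ∩ B| / (|A| + |B|)
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```

A score of 1.0 means the predicted and reference masks overlap perfectly, and 0.0 means they do not overlap at all.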
Abstract
Adapting foundation models to medical segmentation typically requires either backbone fine-tuning or high-capacity task-specific decoders, both of which are difficult to fit reliably when annotations are scarce. We show that frozen DINOv3 features already contain useful structural and boundary cues for medical segmentation, and that the main bottleneck lies in how these features are read out. We propose DINO-MVR, a Multi-View Readout framework for annotation-efficient medical segmentation. DINO-MVR trains only lightweight MLP probes on features from the final three transformer blocks of a frozen DINOv3 backbone, without updating the backbone itself. At inference, each input is interpreted through complementary resolutions and test-time augmentations, whose probability maps are combined by entropy-weighted fusion and refined with simple spatial regularization. For volumetric inputs,…
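The abstract mentions entropy-weighted fusion of per-view probability maps but this summary does not give the formula. The sketch below shows one plausible form, assuming each view is weighted per pixel by `exp(-entropy)` and the weights are normalized across views. The function name, the weighting function, and the array layout `(views, classes, height, width)` are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def entropy_weighted_fusion(probs, eps=1e-8):
    """Fuse K per-view softmax maps into one map.

    probs: array of shape (K, C, H, W); each view sums to 1 over C.
    Assumption: weight = exp(-entropy), normalized over the K views.
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Per-pixel Shannon entropy of each view, shape (K, H, W).
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    # Lower entropy (a more confident view) gets a larger weight.
    weights = np.exp(-entropy)
    weights /= weights.sum(axis=0, keepdims=True)
    # Weighted average over views, shape (C, H, W).
    return np.sum(weights[:, None] * probs, axis=0)
```

Because the weights sum to one across views at every pixel, the fused output is still a valid probability map, and a confident view pulls the result further toward its prediction than a plain average would.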
