DINO-MVR: Multi-View Readout of Frozen DINOv3 for Annotation-Efficient Medical Segmentation

Wei Jiang; Feng Liu; Nan Ye; Hongfu Sun

arXiv:2605.07221·cs.CV·May 11, 2026

TL;DR

DINO-MVR trains only lightweight MLP probes on features from a frozen DINOv3 backbone and, at inference, fuses predictions from multiple resolutions and test-time augmentations, giving strong medical segmentation from very few annotated cases.

Contribution

A multi-view readout method that trains only lightweight MLP probes on frozen DINOv3 features, so the backbone is never fine-tuned.

Findings

01. Achieves high Dice scores on multiple medical benchmarks.

02. Recovers 98.4% of full-data performance with only five annotated BraTS patients.

03. Demonstrates effective segmentation without backbone updates.

Abstract

Adapting foundation models to medical segmentation typically requires either backbone fine-tuning or high-capacity task-specific decoders, both of which are difficult to fit reliably when annotations are scarce. We show that frozen DINOv3 features already contain useful structural and boundary cues for medical segmentation, and that the main bottleneck lies in how these features are read out. We propose DINO-MVR, a Multi-View Readout framework for annotation-efficient medical segmentation. DINO-MVR trains only lightweight MLP probes on features from the final three transformer blocks of a frozen DINOv3 backbone, without updating the backbone itself. At inference, each input is interpreted through complementary resolutions and test-time augmentations, whose probability maps are combined by entropy-weighted fusion and refined with simple spatial regularization. For volumetric inputs,…
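
The entropy-weighted fusion step mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact weighting formula, so everything below is an assumption made for illustration: the function name `entropy_weighted_fusion`, the `(V, C, H, W)` array layout, and the choice of weight (one minus the normalized Shannon entropy per pixel) are hypothetical and may differ from what the paper actually uses.

```python
import numpy as np

def entropy_weighted_fusion(prob_maps, eps=1e-8):
    """Fuse per-view class-probability maps, trusting low-entropy views more.

    prob_maps: array of shape (V, C, H, W); each view sums to 1 over C.
    Returns the fused probabilities with shape (C, H, W).
    NOTE: the weighting rule here is an assumption, not the paper's formula.
    """
    p = np.asarray(prob_maps, dtype=np.float64)
    num_classes = p.shape[1]

    # Per-view, per-pixel Shannon entropy, normalized to roughly [0, 1].
    entropy = -(p * np.log(p + eps)).sum(axis=1) / np.log(num_classes)

    # Assumed confidence: 1 - normalized entropy (eps keeps weights positive).
    confidence = np.clip(1.0 - entropy, 0.0, 1.0) + eps

    # Normalize the confidences across views so they sum to 1 at every pixel.
    weights = confidence / confidence.sum(axis=0, keepdims=True)  # (V, H, W)

    # Weighted average of the view probabilities, renormalized over classes.
    fused = (weights[:, None] * p).sum(axis=0)                    # (C, H, W)
    return fused / fused.sum(axis=0, keepdims=True)


# Two views of a single pixel with two classes: one confident, one uncertain.
views = np.array([
    [[[0.9]], [[0.1]]],   # confident view
    [[[0.5]], [[0.5]]],   # maximally uncertain view
])
fused = entropy_weighted_fusion(views)
```

In this toy case the uncertain view receives almost no weight, so the fused foreground probability stays close to the confident view's 0.9, where a plain average would give 0.7. In practice, each view (a different resolution or test-time augmentation) would first be mapped back to a common grid before fusion.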
