DinoDental: Benchmarking DINOv3 as a Unified Vision Encoder for Dental Image Analysis
Kun Tang, Xinquan Yang, Mianjie Zheng, Xuefen Liu, Xuguang Li, Xiaoqi Guo, Ruihan Chen, Linlin Shen, He Meng

TL;DR
DinoDental systematically evaluates DINOv3 as a versatile, off-the-shelf encoder for dental image analysis, demonstrating its effectiveness across various tasks and imaging modalities without domain-specific pre-training.
Contribution
The paper introduces DinoDental, a comprehensive benchmark for assessing DINOv3's transferability and performance in dental imaging tasks, including analysis of adaptation strategies and model scaling.
Findings
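DINOv3 performs well across the benchmark's dental tasks, including classification, detection, and instance segmentation on panoramic radiographs and intraoral photographs.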
Scaling model size and input resolution improves performance.
Different adaptation strategies affect transfer effectiveness.
Abstract
The scarcity and high cost of expert annotations in dental imaging present a significant challenge for the development of AI in dentistry. DINOv3, a state-of-the-art, self-supervised vision foundation model pre-trained on 1.7 billion images, offers a promising pathway to mitigate this issue. However, its reliability when transferred to the dental domain, with its unique imaging characteristics and clinical subtleties, remains unclear. To address this, we introduce DinoDental, a unified benchmark designed to systematically evaluate whether DINOv3 can serve as a reliable, off-the-shelf encoder for comprehensive dental image analysis without requiring domain-specific pre-training. Constructed from multiple public datasets, DinoDental covers a wide range of tasks, including classification, detection, and instance segmentation on both panoramic radiographs and intraoral photographs. We…
