DINOv3-Guided Cross Fusion Framework for Semantic-aware CT generation from MRI and CBCT
Xianhao Zhou, Jianghao Wu, Ku Zhao, Jinlong He, Huangxuan Zhao, Lei Chen, Shaoting Zhang, Guotai Wang

TL;DR
This paper introduces a framework that combines a frozen self-supervised Transformer (DINOv3) with a trainable CNN to generate high-quality synthetic CT images from MRI and CBCT, improving semantic accuracy and image quality for applications such as radiation dose planning and adaptive radiotherapy.
Contribution
It presents the first use of DINOv3 representations in medical image translation, integrating a learnable cross fusion module and a Multi-Level DINOv3 Perceptual (MLDP) loss for semantic-aware CT synthesis.
Findings
Achieved state-of-the-art MS-SSIM, PSNR, and segmentation metrics.
Demonstrated effectiveness on MRI to CT and CBCT to CT translation tasks.
First application of DINOv3 in medical image translation.
Abstract
Generating synthetic CT images from CBCT or MRI has potential for efficient radiation dose planning and adaptive radiotherapy. However, existing CNN-based models lack global semantic understanding, while Transformers often overfit small medical datasets because of their high model capacity and weak inductive bias. To address these limitations, we propose a DINOv3-Guided Cross Fusion (DGCF) framework that integrates a frozen self-supervised DINOv3 Transformer with a trainable CNN encoder-decoder. It hierarchically fuses the Transformer's global representations with the CNN's local features via a learnable cross fusion module, balancing local appearance and contextual representation. Furthermore, we introduce a Multi-Level DINOv3 Perceptual (MLDP) loss that encourages semantic similarity between the synthetic CT and the ground truth in DINOv3's feature space. Experiments on the SynthRAD2023 pelvic…
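The idea behind a multi-level perceptual loss can be illustrated with a minimal sketch. The abstract does not state which distance, layers, or level weights the MLDP loss uses, so everything below is an assumption made for illustration: the function names are hypothetical, cosine distance is a stand-in for whatever distance the authors chose, and plain Python lists stand in for token features that would, in practice, be extracted from a frozen DINOv3 encoder at several depths.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: a zero vector counts as uncorrelated
    return 1.0 - dot / (norm_u * norm_v)


def multi_level_perceptual_loss(pred_feats, target_feats, level_weights=None):
    """Weighted average of per-level mean token distances.

    pred_feats / target_feats: one entry per feature level; each entry is a
    list of token vectors (lists of floats) for the synthetic and real CT.
    level_weights: optional per-level weights (assumed uniform by default).
    """
    if len(pred_feats) != len(target_feats):
        raise ValueError("both inputs need the same number of levels")
    if level_weights is None:
        level_weights = [1.0] * len(pred_feats)

    total = 0.0
    for weight, pred_level, target_level in zip(level_weights, pred_feats, target_feats):
        if len(pred_level) != len(target_level):
            raise ValueError("token counts differ within a level")
        distances = [cosine_distance(p, t) for p, t in zip(pred_level, target_level)]
        total += weight * sum(distances) / len(distances)
    return total / sum(level_weights)


# Toy features: level 0 matches exactly, level 1 is orthogonal.
synthetic = [[[1.0, 2.0]], [[1.0, 0.0]]]
reference = [[[1.0, 2.0]], [[0.0, 1.0]]]
loss = multi_level_perceptual_loss(synthetic, reference)
```

In a training setup this term would typically be added to a pixel-wise reconstruction loss, with the feature extractor kept frozen so that gradients only update the generator.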
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Radiomics and Machine Learning in Medical Imaging
