DINO-BOLDNet: A DINOv3-Guided Multi-Slice Attention Network for T1-to-BOLD Generation

Jianwei Wang; Qing Wang; Menglan Ruan; Rongjun Ge; Chunfeng Yang; Yang Chen; Chunming Xie

arXiv:2512.08337·cs.CV·December 10, 2025

Open Access

TL;DR

This paper introduces DINO-BOLDNet, a network that generates mean BOLD images from T1-weighted (T1w) images by combining a frozen self-supervised DINOv3 encoder, a multi-slice attention module, and a lightweight trainable decoder.

Contribution

The paper presents what its authors describe as the first framework for directly generating mean BOLD images from T1w images, using a DINOv3-guided multi-slice attention model trained with a DINO-based perceptual loss.

Findings

01 Outperforms conditional GAN baseline in PSNR and MS-SSIM

02 Uses self-supervised DINOv3 for structural feature extraction

03 Achieves accurate BOLD image generation from T1w images
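
The first finding is stated in terms of PSNR. As a reminder of what that metric measures, PSNR is 10 * log10(R^2 / MSE), where R is the intensity range and MSE is the mean squared error between prediction and ground truth. The sketch below is a minimal pure-Python illustration; the function name `psnr` and the `data_range` parameter are assumptions chosen for this example, and the paper's actual evaluation settings (normalization, masking, slice selection) are not stated on this page.

```python
import math


def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio (dB) between two 2D images given as lists of rows.

    Illustrative helper only; not the paper's evaluation code.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("pred and target must have the same shape")

    n_pixels = sum(len(row) for row in pred)
    # Mean squared error over all pixels.
    mse = sum(
        (p - t) ** 2
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
    ) / n_pixels

    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy 2x2 "images" with intensities in [0, 1].
predicted = [[0.0, 0.5], [1.0, 0.5]]
reference = [[0.1, 0.5], [0.9, 0.5]]
score = psnr(predicted, reference)
```

Higher PSNR means lower pixel-wise error. PSNR says nothing about structural similarity on its own, which is presumably why the finding pairs it with MS-SSIM.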

Abstract

Generating BOLD images from T1w images offers a promising solution for recovering missing BOLD information and enabling downstream tasks when BOLD images are corrupted or unavailable. Motivated by this, we propose DINO-BOLDNet, a DINOv3-guided multi-slice attention framework that integrates a frozen self-supervised DINOv3 encoder with a lightweight trainable decoder. The model uses DINOv3 to extract within-slice structural representations, and a separate slice-attention module to fuse contextual information across neighboring slices. A multi-scale generation decoder then restores fine-grained functional contrast, while a DINO-based perceptual loss encourages structural and textural consistency between predictions and ground-truth BOLD in the transformer feature space. Experiments on a clinical dataset of 248 subjects show that DINO-BOLDNet surpasses a conditional GAN baseline in both…
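
The abstract describes a slice-attention module that fuses context across neighboring slices but does not give its exact form. One common way to do this kind of fusion is scaled dot-product attention with the target slice as the query. The sketch below illustrates that idea in pure Python under this assumption; `fuse_slices`, the use of one feature vector per slice, and the absence of learned projections are all simplifications made for this example, not details taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_slices(features, center):
    """Fuse per-slice feature vectors with attention centered on one slice.

    features: list of equal-length feature vectors, one per neighboring slice
              (in a real model these would come from the frozen encoder).
    center:   index of the slice being generated; its features act as the query.
    Returns (fused_vector, attention_weights).
    """
    query = features[center]
    scale = math.sqrt(len(query))
    # Similarity of the center slice to every slice, including itself.
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale for key in features
    ]
    weights = softmax(scores)
    # Weighted sum of slice features, one output value per feature dimension.
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, features))
        for d in range(len(query))
    ]
    return fused, weights


# Three neighboring slices with 2-dimensional toy features.
slice_features = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
fused, weights = fuse_slices(slice_features, center=1)
```

In this toy case the center slice receives the largest weight because it is most similar to itself, while the two neighbors share the remainder equally. A trainable version would add learned query, key, and value projections and operate on spatial feature maps rather than a single vector per slice.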

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications