BrainRotViT: Transformer-ResNet Hybrid for Explainable Modeling of Brain Aging from 3D sMRI
Wasif Jalal, Md Nafiu Rahman, Atif Hasan Rahman, M. Sohel Rahman

TL;DR
BrainRotViT is a hybrid transformer-ResNet model that accurately predicts brain age from 3D MRI, offering interpretability and strong generalization across diverse datasets, aiding neurodegeneration research.
Contribution
This work introduces BrainRotViT, a novel hybrid architecture combining ViT and residual CNNs for efficient, interpretable brain age estimation from 3D MRI data.
Findings
Achieves a mean absolute error (MAE) of 3.34 years on validation datasets
Outperforms state-of-the-art models in brain age prediction
Generalizes well across multiple independent cohorts
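For readers unfamiliar with the metric quoted in the findings, the following is a minimal sketch of how a mean absolute error in years is computed from true and predicted ages. The function name and the four sample ages are hypothetical illustrations, not values from the paper.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages, in years."""
    if not true_ages or len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(true_ages, predicted_ages)) / len(true_ages)


# Hypothetical chronological ages and model predictions (not data from the paper).
true_ages = [25.0, 40.0, 63.0, 71.0]
predicted_ages = [28.0, 38.0, 60.0, 75.0]

mae = mean_absolute_error(true_ages, predicted_ages)
print(f"MAE: {mae:.2f} years")
```

An MAE of 3.34 years would therefore mean that, averaged over subjects, the predicted age differs from the chronological age by about 3.34 years in either direction.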
Abstract
Accurate brain age estimation from structural MRI is a valuable biomarker for studying aging and neurodegeneration. Traditional regression and CNN-based methods face limitations such as manual feature engineering, limited receptive fields, and overfitting on heterogeneous data. Pure transformer models, while effective, require large datasets and incur high computational cost. We propose Brain ResNet over trained Vision Transformer (BrainRotViT), a hybrid architecture that combines the global context modeling of vision transformers (ViT) with the local refinement of residual CNNs. A ViT encoder is first trained on an auxiliary age and sex classification task to learn slice-level features. The frozen encoder is then applied to all sagittal slices to generate a 2D matrix of embedding vectors, which is fed into a residual CNN regressor that incorporates subject sex at the final fully-connected…
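The data flow described in the abstract (per-slice encoding, a 2D matrix of embeddings, residual refinement, and sex added at the final layer) can be illustrated with a dependency-free toy. This is only a shape-level sketch under loud assumptions: the linear projection stands in for the frozen ViT encoder, the neighbour-averaging step stands in for a residual CNN block, and every function name, size, and weight is hypothetical rather than taken from the paper.

```python
import random

rng = random.Random(0)  # fixed seed so the toy run is repeatable
N_SLICES, HEIGHT, WIDTH, EMBED_DIM = 6, 4, 4, 3  # toy sizes, not the paper's


def encode_slice(slice_2d, proj):
    """Stand-in for the frozen ViT encoder: flatten one slice, project linearly."""
    flat = [v for row in slice_2d for v in row]
    return [sum(w * v for w, v in zip(w_row, flat)) for w_row in proj]


def embed_volume(volume, proj):
    """Encode every sagittal slice, giving an (n_slices x embed_dim) matrix."""
    return [encode_slice(s, proj) for s in volume]


def residual_block(matrix):
    """Toy residual step: output = input + f(input), with f a local average
    over neighbouring slices (a stand-in for a convolution)."""
    out = []
    for i, row in enumerate(matrix):
        neigh = matrix[max(0, i - 1):i + 2]
        f = [sum(r[j] for r in neigh) / len(neigh) for j in range(len(row))]
        out.append([x + fx for x, fx in zip(row, f)])
    return out


def predict_age(matrix, sex, weights, bias):
    """Average-pool over slices, append sex, then apply one linear layer."""
    dim = len(matrix[0])
    pooled = [sum(r[j] for r in matrix) / len(matrix) for j in range(dim)]
    features = pooled + [float(sex)]  # sex enters only at the final layer
    return sum(w * f for w, f in zip(weights, features)) + bias


# Random stand-ins for a 3D volume and for untrained parameters.
volume = [[[rng.random() for _ in range(WIDTH)] for _ in range(HEIGHT)]
          for _ in range(N_SLICES)]
proj = [[rng.gauss(0, 1) for _ in range(HEIGHT * WIDTH)] for _ in range(EMBED_DIM)]
weights = [rng.gauss(0, 1) for _ in range(EMBED_DIM + 1)]

embeddings = embed_volume(volume, proj)   # 2D matrix of slice embeddings
refined = residual_block(embeddings)      # local refinement across slices
predicted_age = predict_age(refined, sex=1, weights=weights, bias=50.0)
print(len(embeddings), len(embeddings[0]), predicted_age)
```

Because the parameters are random, the printed age is meaningless; the point is only the order of operations and the shapes passed between stages.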
Taxonomy
Topics: Functional Brain Connectivity Studies · EEG and Brain-Computer Interfaces · Domain Adaptation and Few-Shot Learning
