MultiViT2: A Data-augmented Multimodal Neuroimaging Prediction Framework via Latent Diffusion Model
Bi Yuda, Jia Sihan, Gao Yutong, Abrol Anees, Fu Zening, Calhoun Vince

TL;DR
MultiViT2 is a neuroimaging prediction framework that combines multimodal data, a vision transformer, and latent diffusion-based data augmentation to improve schizophrenia classification accuracy.
Contribution
It introduces a data augmentation module based on latent diffusion models and integrates it with a vision transformer for enhanced neuroimaging predictions.
Findings
Outperforms the first-generation model in schizophrenia classification accuracy
Demonstrates strong scalability and portability
Reduces overfitting through data augmentation
Abstract
Multimodal medical imaging integrates diverse data types, such as structural and functional neuroimaging, to provide complementary insights that enhance deep learning predictions and improve outcomes. This study focuses on a neuroimaging prediction framework based on both structural and functional neuroimaging data. We propose a next-generation prediction model, MultiViT2, which combines a pretrained representation learning base model with a vision transformer backbone for prediction output. Additionally, we developed a data augmentation module based on the latent diffusion model that enriches input data by generating augmented neuroimaging samples, thereby enhancing predictive performance through reduced overfitting and improved generalizability. We show that MultiViT2 significantly outperforms the first-generation model in schizophrenia classification accuracy and…
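To make the augmentation idea in the abstract concrete, here is a minimal sketch of enlarging a training set with generated latent samples. It is not the authors' implementation: the function names (`linear_alpha_bar`, `forward_noise`, `augment`), the linear beta schedule, the latent size, and the label-reuse policy are all assumptions chosen for illustration. The stand-in generator only applies the standard DDPM forward noising step, z_t = sqrt(a_bar_t) * z_0 + sqrt(1 - a_bar_t) * eps; a trained latent diffusion model would instead run its learned reverse (denoising) process and decode the result.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def forward_noise(z0, t, alpha_bar, rng):
    """Sample z_t from q(z_t | z_0) = N(sqrt(a_bar_t) * z_0, (1 - a_bar_t) * I)."""
    a = alpha_bar[t]
    return [math.sqrt(a) * z + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for z in z0]


def augment(latents, labels, generate, copies_per_sample):
    """Append generated latents for each real latent, reusing its label (assumed policy)."""
    out_x, out_y = list(latents), list(labels)
    for z, y in zip(latents, labels):
        for _ in range(copies_per_sample):
            out_x.append(generate(z))
            out_y.append(y)
    return out_x, out_y


# Toy usage: 4 hypothetical 8-dimensional latents with binary labels.
rng = random.Random(0)
alpha_bar = linear_alpha_bar(1000)
real_latents = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
real_labels = [0, 1, 0, 1]

aug_x, aug_y = augment(
    real_latents,
    real_labels,
    generate=lambda z: forward_noise(z, 50, alpha_bar, rng),  # stand-in generator
    copies_per_sample=2,
)
print(len(aug_x), len(aug_y))  # 12 12
```

The original samples stay at the front of the augmented set, so a real-only validation split can still be carved out before augmentation, which matters when the goal is to measure reduced overfitting rather than memorization of synthetic data.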
Taxonomy
Topics: Machine Learning in Healthcare
Methods: Dense Connections · Layer Normalization · Vision Transformer · Diffusion · Latent Diffusion Model · Balanced Selection
