Modality-Aware and Anatomical Vector-Quantized Autoencoding for Multimodal Brain MRI
Mingjie Li, Edward Kim, Yue Zhao, Ehsan Adeli, Kilian M. Pohl

TL;DR
This paper introduces NeuroQuant, a modality-aware 3D vector-quantized VAE (VQ-VAE) that reconstructs multi-modal brain MRIs by separating modality-invariant anatomical structure from modality-dependent appearance.
Contribution
It proposes a dual-stream 3D encoder with shared anatomical encoding and modality-specific appearance features, trained with a joint 2D/3D strategy for multi-modal MRI reconstruction.
Findings
NeuroQuant outperforms existing VAEs in reconstruction fidelity.
Factorized multi-axis attention in the shared latent representation captures relationships between distant brain regions.
The model enables scalable cross-modal brain image analysis.
Abstract
Learning a robust Variational Autoencoder (VAE) is a fundamental step for many deep learning applications in medical image analysis, such as MRI synthesis. Existing brain VAEs predominantly focus on single-modality data (i.e., T1-weighted MRI), overlooking the complementary diagnostic value of other modalities like T2-weighted MRIs. Here, we propose a modality-aware and anatomically grounded 3D vector-quantized VAE (VQ-VAE) for reconstructing multi-modal brain MRIs. Called NeuroQuant, it first learns a shared latent representation across modalities using factorized multi-axis attention, which can capture relationships between distant brain regions. It then employs a dual-stream 3D encoder that explicitly separates the encoding of modality-invariant anatomical structures from modality-dependent appearance. Next, the anatomical encoding is discretized using a shared codebook and…
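To make the "discretized using a shared codebook" step concrete, here is a minimal sketch of generic vector quantization: each continuous latent vector is replaced by its nearest codebook entry, and the same codebook serves every modality. This is not NeuroQuant's implementation. The abstract is truncated, so the codebook size, latent dimensions, and training losses are not known here. The names `quantize`, `codebook`, `t1_latents`, and `t2_latents`, and all the toy values, are assumptions chosen for illustration. A real model would do this on 3D feature maps with a tensor library rather than plain Python lists.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Map each latent vector to the index and value of its nearest codebook entry."""
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        for z in latents
    ]
    quantized = [codebook[k] for k in indices]
    return indices, quantized


# Hypothetical toy codebook with three 2-D entries (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]

# Hypothetical anatomical latents from two modalities; both are quantized
# against the SAME codebook, which is what "shared" means in this sketch.
t1_latents = [(0.1, -0.2), (0.9, 1.2)]
t2_latents = [(-0.1, 0.1), (-0.8, 1.7)]

t1_indices, t1_quantized = quantize(t1_latents, codebook)
t2_indices, t2_quantized = quantize(t2_latents, codebook)

print(t1_indices, t2_indices)
```

Because both modalities index into one codebook, latents that land on the same entry (the first vector of each modality in this toy example) receive an identical discrete code regardless of which scan they came from.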
