DiaMond: Dementia Diagnosis with Multi-Modal Vision Transformers Using MRI and PET
Yitong Li, Morteza Ghahremani, Youssef Wally, Christian Wachinger

TL;DR
DiaMond is a novel multi-modal vision Transformer framework that effectively integrates MRI and PET data for improved dementia diagnosis, outperforming existing methods in accuracy and robustness.
Contribution
The paper introduces DiaMond, a multi-modal vision Transformer that combines self-attention, a bi-attention mechanism, and multi-modal normalization to integrate MRI and PET data for dementia diagnosis.
Findings
Achieves 92.4% accuracy in AD diagnosis
Attains 65.2% accuracy in AD-MCI-CN classification
Reaches 76.5% accuracy in differential diagnosis of AD and FTD
Abstract
Diagnosing dementia, particularly for Alzheimer's Disease (AD) and frontotemporal dementia (FTD), is complex due to overlapping symptoms. While magnetic resonance imaging (MRI) and positron emission tomography (PET) data are critical for the diagnosis, integrating these modalities in deep learning faces challenges, often resulting in suboptimal performance compared to using single modalities. Moreover, the potential of multi-modal approaches in differential diagnosis, which holds significant clinical importance, remains largely unexplored. We propose a novel framework, DiaMond, to address these issues with vision Transformers to effectively integrate MRI and PET. DiaMond is equipped with self-attention and a novel bi-attention mechanism that synergistically combine MRI and PET, alongside a multi-modal normalization to reduce redundant dependency, thereby boosting the performance.…
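To make the idea of attention across modalities concrete, here is a minimal sketch of generic scaled dot-product cross-attention between two token sets. It is not the paper's bi-attention mechanism or its multi-modal normalization, whose details are not given in the abstract above. The token counts, embedding size, random inputs, mean-pooling fusion, and the helper names `softmax` and `cross_attention` are all illustrative assumptions, and learned query/key/value projections are omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    """Generic scaled dot-product attention (hypothetical helper).

    Each query token attends over all context tokens. No learned
    projections are used; the raw embeddings act as Q, K, and V.
    """
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)   # shape (n_queries, n_context)
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ context, weights

rng = np.random.default_rng(0)
# Assumption: 8 patch embeddings of dimension 16 per modality (random stand-ins).
mri_tokens = rng.normal(size=(8, 16))
pet_tokens = rng.normal(size=(8, 16))

# MRI tokens query PET tokens, and PET tokens query MRI tokens.
mri_from_pet, w_mri_pet = cross_attention(mri_tokens, pet_tokens)
pet_from_mri, w_pet_mri = cross_attention(pet_tokens, mri_tokens)

# Assumed fusion step: mean-pool each branch and concatenate.
fused = np.concatenate([mri_from_pet.mean(axis=0), pet_from_mri.mean(axis=0)])
print(fused.shape)
```

In a trained model the fused vector would feed a classification head; here it only shows how information from one modality can be routed to the other.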
Taxonomy
Topics: Brain Tumor Detection and Classification
Methods: Bilinear Attention
