X-SiT: Inherently Interpretable Surface Vision Transformers for Dementia Diagnosis
Fabian Bongratz, Tom Nuno Wolf, Jaume Gual Ramon, Christian Wachinger

TL;DR
X-SiT is a novel interpretable neural network for dementia diagnosis that leverages cortical surface features, achieving state-of-the-art accuracy while providing human-understandable explanations and prototypes aligned with disease patterns.
Contribution
The paper introduces the first inherently interpretable surface-based vision transformer for brain imaging, incorporating a prototypical surface patch decoder for explainability.
Findings
Achieves state-of-the-art accuracy in detecting Alzheimer's disease and frontotemporal dementia
Provides interpretable cortical prototypes aligned with known disease patterns
Reveals classification errors through case-based reasoning
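To make the idea of prototype-based, case-based reasoning more concrete, the sketch below shows a generic prototype classifier over patch embeddings. It is not the X-SiT decoder: the function names (`cosine`, `prototype_scores`), the use of cosine similarity, the sum-of-best-matches scoring rule, and the toy class labels and vectors are all assumptions made for illustration. The point is only that each class score can be traced back to the specific patch that best matched each prototype, which is what lets a reader inspect why a prediction was made.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def prototype_scores(patch_embeddings, prototypes):
    """Score each class by summing, over its prototypes, the best patch similarity.

    patch_embeddings: list of vectors, one per surface patch (hypothetical input).
    prototypes: dict mapping class label -> list of prototype vectors (hypothetical).
    Returns (scores, evidence); evidence[label] lists, per prototype,
    the (best_patch_index, similarity) pair that produced its contribution.
    """
    scores, evidence = {}, {}
    for label, protos in prototypes.items():
        total, matches = 0.0, []
        for proto in protos:
            sims = [cosine(emb, proto) for emb in patch_embeddings]
            best_patch = max(range(len(sims)), key=sims.__getitem__)
            total += sims[best_patch]
            matches.append((best_patch, sims[best_patch]))
        scores[label] = total
        evidence[label] = matches
    return scores, evidence

# Toy data: three 2-D patch embeddings and one prototype per class (all made up).
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
prototypes = {"class_a": [[1.0, 0.0]], "class_b": [[-1.0, 0.0]]}

scores, evidence = prototype_scores(patches, prototypes)
predicted = max(scores, key=scores.get)
print(predicted, evidence[predicted])
```

In this toy setup the evidence for the predicted class points at a single patch index, so a misclassification could be examined by looking at which patch matched which prototype. How X-SiT actually computes and visualizes its prototypes is described in the paper itself, not here.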
Abstract
Interpretable models are crucial for supporting clinical decision-making, driving advances in their development and application for medical images. However, the nature of 3D volumetric data makes it inherently challenging to visualize and interpret intricate and complex structures like the cerebral cortex. Cortical surface renderings, on the other hand, provide a more accessible and understandable 3D representation of brain anatomy, facilitating visualization and interactive exploration. Motivated by this advantage and the widespread use of surface data for studying neurological disorders, we present the eXplainable Surface Vision Transformer (X-SiT). This is the first inherently interpretable neural network that offers human-understandable predictions based on interpretable cortical features. As part of X-SiT, we introduce a prototypical surface patch decoder for classifying surface…
Taxonomy
Topics: Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors · Machine Learning in Materials Science
Methods: Dropout · Dense Connections · Absolute Position Encodings · Layer Normalization · Vision Transformer · ALIGN · Byte Pair Encoding · Label Smoothing · Softmax · Transformer
