X-SiT: Inherently Interpretable Surface Vision Transformers for Dementia Diagnosis

Fabian Bongratz; Tom Nuno Wolf; Jaume Gual Ramon; Christian Wachinger

arXiv:2506.20267 · cs.GR · June 26, 2025

TL;DR

X-SiT is a novel interpretable neural network for dementia diagnosis that leverages cortical surface features, achieving state-of-the-art accuracy while providing human-understandable explanations and prototypes aligned with disease patterns.

Contribution

The paper introduces X-SiT, the first inherently interpretable surface-based vision transformer for brain imaging, which incorporates a prototypical surface patch decoder for explainability.

Findings

01. State-of-the-art accuracy in Alzheimer's and frontotemporal dementia detection

02. Provides interpretable cortical prototypes aligned with known disease patterns

03. Reveals classification errors through case-based reasoning

Abstract

Interpretable models are crucial for supporting clinical decision-making, driving advances in their development and application for medical images. However, the nature of 3D volumetric data makes it inherently challenging to visualize and interpret intricate and complex structures like the cerebral cortex. Cortical surface renderings, on the other hand, provide a more accessible and understandable 3D representation of brain anatomy, facilitating visualization and interactive exploration. Motivated by this advantage and the widespread use of surface data for studying neurological disorders, we present the eXplainable Surface Vision Transformer (X-SiT). This is the first inherently interpretable neural network that offers human-understandable predictions based on interpretable cortical features. As part of X-SiT, we introduce a prototypical surface patch decoder for classifying surface…

Taxonomy

Topics: Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors · Machine Learning in Materials Science

Methods: Dropout · Dense Connections · Absolute Position Encodings · Layer Normalization · Vision Transformer · ALIGN · Byte Pair Encoding · Label Smoothing · Softmax · Transformer