# Vision and convolutional transformers for Alzheimer’s disease diagnosis: a systematic review of architectures, multimodal fusion and critical gaps

**Authors:** Ibrahem Afifi, Mostafa Elgendy, Mohamed Abdelfatah, Shaker El-Sappagh

PMC · DOI: 10.1186/s40708-025-00286-7 · 2025-12-17

## TL;DR

This systematic review examines how Vision Transformers (ViTs) and Convolutional Vision Transformers (CViTs) are used to diagnose Alzheimer’s disease (AD), highlighting trends and gaps across 68 studies published between 2021 and 2025.

## Contribution

The paper introduces novel taxonomies that categorize ViT- and CViT-based AD diagnosis studies by model architecture, data modality, fusion strategy, and diagnostic objective. It also identifies critical gaps in multimodal integration and reproducibility.

## Key findings

- Hybrid CViT frameworks are increasingly used in Alzheimer’s diagnosis.
- Current studies give limited attention to progression from Mild Cognitive Impairment to AD.
- Algorithmic reproducibility remains a significant challenge in the field.

## Abstract

Alzheimer’s disease (AD), a significant public health challenge, requires accurate early diagnosis to improve patient outcomes. Vision Transformers (ViTs) and Convolutional Vision Transformers (CViTs) have emerged as powerful Deep Learning architectures for this task. Following PRISMA guidelines, this systematic review analyzes 68 studies selected from 564 publications (2021–2025) across five major databases: Scopus, Web of Science, ScienceDirect, IEEE Xplore, and PubMed. We introduce novel taxonomies to systematically categorize these works by model architecture, data modality, fusion strategy, and diagnostic objective. Our analysis reveals key trends, such as the rise of hybrid CViT frameworks, and critical gaps, including a limited focus on Mild Cognitive Impairment-to-AD progression. Critically, we also assess practical implementation details, revealing widespread challenges in algorithmic reproducibility. The discussion culminates in a forward-looking analysis of Large Vision Models and proposes future directions emphasizing the need for robust multimodal integration, lightweight transformer designs, and Explainable AI to advance AD research and bridge the critical gap between high-performance modeling and clinical applicability.

## Linked entities

- **Diseases:** Alzheimer’s disease (MONDO:0004975)

## Full-text entities

- **Diseases:** Cognitive Impairment (MESH:D003072), AD (MESH:D000544)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

44 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12764722/full.md

---
Source: https://tomesphere.com/paper/PMC12764722