Cross-Enhanced Multimodal Fusion of Eye-Tracking and Facial Features for Alzheimer's Disease Diagnosis

Yujie Nie, Jianzhang Ni, Yonglong Ye, Yuan-Ting Zhang, Yun Kwok Wing, Xiangqing Xu, Xin Ma, and Lizhou Fan

arXiv:2510.24777 · cs.CV · October 30, 2025

TL;DR

This paper introduces a multimodal fusion framework that combines eye-tracking and facial features for Alzheimer's disease diagnosis, using a cross-attention fusion module and a direction-aware convolution module to improve accuracy and robustness.

Contribution

The study presents a new cross-enhanced fusion approach with specialized modules for inter-modal interaction and directional feature extraction, along with a synchronized multimodal dataset for AD detection.

Findings

01. Achieved 95.11% classification accuracy in distinguishing AD from healthy controls.

02. Outperformed traditional fusion methods in robustness and diagnostic performance.

03. Demonstrated the effectiveness of modeling inter-modal dependencies and modality-specific features.

Abstract

Accurate diagnosis of Alzheimer's disease (AD) is essential for enabling timely intervention and slowing disease progression. Multimodal diagnostic approaches offer considerable promise by integrating complementary information across behavioral and perceptual domains. Eye-tracking and facial features, in particular, are important indicators of cognitive function, reflecting attentional distribution and neurocognitive state. However, few studies have explored their joint integration for auxiliary AD diagnosis. In this study, we propose a multimodal cross-enhanced fusion framework that synergistically leverages eye-tracking and facial features for AD detection. The framework incorporates two key modules: (a) a Cross-Enhanced Fusion Attention Module (CEFAM), which models inter-modal interactions through cross-attention and global enhancement, and (b) a Direction-Aware Convolution Module…
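The cross-attention idea that the abstract attributes to CEFAM can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the token counts, feature dimension, residual connection, and mean-pooled concatenation are all assumptions made for illustration, and the learned projections, the global enhancement step, and the direction-aware convolution are omitted entirely. It only shows how each modality can attend to the other with scaled dot-product attention.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats):
    """Scaled dot-product attention where query tokens attend to context tokens.

    No learned Q/K/V projections are used here (an illustrative simplification).
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)  # (n_query, n_context)
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    return weights @ context_feats, weights

rng = np.random.default_rng(0)
eye = rng.standard_normal((6, 16))   # hypothetical: 6 eye-tracking tokens, dim 16
face = rng.standard_normal((4, 16))  # hypothetical: 4 facial-feature tokens, dim 16

# Each modality is enhanced with information gathered from the other one.
eye_from_face, w_ef = cross_attention(eye, face)
face_from_eye, w_fe = cross_attention(face, eye)

# Assumed residual connection, then mean pooling and concatenation.
eye_out = eye + eye_from_face
face_out = face + face_from_eye
fused = np.concatenate([eye_out.mean(axis=0), face_out.mean(axis=0)])

print(fused.shape)
```

In a trained model the fused vector would feed a classifier head; here it only demonstrates the data flow between the two modalities.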
