MC-ViViT: Multi-branch Classifier-ViViT to detect Mild Cognitive Impairment in older adults using facial videos
Jian Sun, Hiroko H. Dodge, and Mohammad H. Mahoor

TL;DR
This paper introduces MC-ViViT, a multi-branch video vision transformer that detects Mild Cognitive Impairment from facial videos, reporting 90.63% accuracy on the imbalanced I-CONECT dataset.
Contribution
The paper presents a multi-branch ViViT architecture (MC-ViViT) together with HP Loss, a loss function built on Focal loss that targets the Hard-Easy and Positive-Negative sample imbalance in MCI detection from facial videos.
Findings
Achieved 90.63% accuracy on the I-CONECT dataset.
Demonstrated effectiveness of the HP Loss in imbalanced classification.
Showed potential of facial video analysis for early MCI detection.
Abstract
Deep machine learning models including Convolutional Neural Networks (CNN) have been successful in the detection of Mild Cognitive Impairment (MCI) using medical images, questionnaires, and videos. This paper proposes a novel Multi-branch Classifier-Video Vision Transformer (MC-ViViT) model to distinguish MCI from those with normal cognition by analyzing facial features. The data comes from the I-CONECT, a behavioral intervention trial aimed at improving cognitive function by providing frequent video chats. MC-ViViT extracts spatiotemporal features of videos in one branch and augments representations by the MC module. The I-CONECT dataset is challenging as the dataset is imbalanced containing Hard-Easy and Positive-Negative samples, which impedes the performance of MC-ViViT. We propose a loss function for Hard-Easy and Positive-Negative Samples (HP Loss) by combining Focal loss and…
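The abstract is cut off before it names the second component of HP Loss, so the sketch below covers only the Focal loss part and should not be read as the authors' HP Loss. It is a minimal pure-Python illustration of the standard binary Focal loss, showing how the focusing term down-weights easy samples and how a class weight handles the positive-negative imbalance. The function names and the default values `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration, not values taken from the paper.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Standard binary focal loss for a single sample (illustrative only).

    p:     predicted probability of the positive class, in [0, 1]
    y:     true label, 1 (positive) or 0 (negative)
    gamma: focusing parameter; larger values down-weight easy samples more
    alpha: weight on the positive class; negatives are weighted 1 - alpha
    """
    # Probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clip to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch given as plain Python sequences."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0` the focusing term disappears and the expression reduces to class-weighted cross-entropy, which is a convenient sanity check when tuning the two parameters.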
Taxonomy
Topics: Dementia and Cognitive Impairment Research · Machine Learning in Healthcare · Artificial Intelligence in Healthcare
Methods: Multi-Head Attention · Attention Is All You Need · Dropout · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Softmax · Focal Loss · Linear Layer · Byte Pair Encoding · Layer Normalization
