Evaluating Picture Description Speech for Dementia Detection using Image-text Alignment
Youxiang Zhu, Nana Lin, Xiaohui Liang, John A. Batsis, Robert M. Roth, Brian MacWhinney

TL;DR
This paper introduces novel dementia detection models that integrate picture information with speech descriptions using large image-text alignment models, significantly improving detection accuracy over text-only methods.
Contribution
It is the first to incorporate picture and text data together with pre-trained image-text alignment models for dementia detection, enhancing accuracy and interpretability.
Findings
Achieved state-of-the-art detection accuracy of 83.44%.
Using picture relevance improves model performance.
Models effectively categorize sentences by focused picture areas.
Abstract
Using picture description speech for dementia detection has been studied for 30 years. Despite this long history, previous models focus on identifying differences in speech patterns between healthy subjects and patients with dementia but do not use the picture information directly. In this paper, we propose the first dementia detection models that take both the picture and the description texts as inputs and incorporate knowledge from large pre-trained image-text alignment models. We observe that dementia and healthy samples differ in the text's relevance to the picture and in the focused area of the picture. We therefore hypothesize that this difference can be used to enhance dementia detection accuracy. Specifically, we use the text's relevance to the picture to rank and filter the sentences of the samples. We also identify focused areas of the picture as topics and…
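The ranking-and-filtering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each sentence and the picture have already been embedded into a shared vector space by some image-text alignment model, and that relevance is measured by cosine similarity. Both are assumptions made for illustration, not details confirmed by this summary. The function names, the `top_k` parameter, and the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_and_filter(
    sentences: List[str],
    sentence_embeddings: List[Vector],
    picture_embedding: Vector,
    top_k: int,
) -> List[Tuple[float, str]]:
    """Score each sentence by similarity to the picture, keep the top_k.

    Assumption: embeddings come from an image-text alignment model that
    places text and images in the same space. The paper's actual scoring
    and filtering rule may differ from this sketch.
    """
    scored = [
        (cosine_similarity(emb, picture_embedding), sent)
        for sent, emb in zip(sentences, sentence_embeddings)
    ]
    # Highest relevance first; sorting on the score only keeps ties stable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-D vectors invented for illustration, not real model outputs.
    picture = [1.0, 0.0]
    sentences = [
        "A boy reaches for the cookie jar.",
        "I had eggs for breakfast.",
        "Water is running over the sink.",
    ]
    embeddings = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
    for score, sentence in rank_and_filter(sentences, embeddings, picture, top_k=2):
        print(f"{score:.3f}  {sentence}")
```

In this sketch the off-topic sentence receives the lowest score and is dropped. A downstream classifier would then see only the sentences most relevant to the picture.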
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Biomedical Text Mining and Ontologies
Methods: Focus
