Context-aware attention layers coupled with optimal transport domain adaptation and multimodal fusion methods for recognizing dementia from spontaneous speech
Loukas Ilias, Dimitris Askounis

TL;DR
This paper introduces a novel multimodal approach using context-aware attention, optimal transport, and calibration techniques for improved dementia detection from spontaneous speech, achieving state-of-the-art accuracy.
Contribution
It proposes new methods combining context-aware attention, domain adaptation, and calibration for multimodal dementia recognition, addressing limitations of prior models.
Findings
Achieved up to 91.25% accuracy and 91.06% F1-score.
Demonstrated improved performance over existing methods.
Effectively captured intra- and inter-modal interactions.
Abstract
Alzheimer's disease (AD) is a complex neurocognitive disease and the main cause of dementia. Although many studies have targeted the diagnosis of dementia through spontaneous speech, limitations remain. Existing state-of-the-art multimodal approaches train language and acoustic models separately, employ majority-vote approaches, and concatenate the representations of the different modalities either at the input level, i.e., early fusion, or during training. Some of them also employ self-attention layers, which calculate the dependencies between representations without considering contextual information. In addition, no prior work has taken model calibration into consideration. To address these limitations, we propose new methods for detecting AD patients, which capture the intra- and cross-modal interactions. First,…
Taxonomy
Topics: Emotion and Mood Recognition · Speech and Audio Processing · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Warmup With Linear Decay · Dense Connections · WordPiece · Attention Dropout · Softmax · Layer Normalization · Dropout
