TL;DR
This paper presents a multi-modal fusion model using gating mechanisms to combine audio, lexical, and disfluency features for improved Alzheimer's Disease detection and severity prediction from spontaneous speech.
Contribution
It introduces a novel gating-based fusion approach that integrates unimodal LSTM decisions for better cognitive impairment assessment.
Findings
The model achieves promising results on the ADReSS challenge datasets.
Disfluency features relate to cognitive impairment levels.
Sequence modeling effectively detects Alzheimer's from speech data.
Abstract
This paper is a submission to the Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) challenge, which aims to develop methods that can assist in the automated prediction of the severity of Alzheimer's Disease from speech data. We focus on acoustic and natural language features for cognitive impairment detection in spontaneous speech, in the context of Alzheimer's Disease diagnosis and Mini-Mental State Examination (MMSE) score prediction. We propose a model that obtains unimodal decisions from separate LSTMs, one for each modality (text and audio), and then combines them using a gating mechanism for the final prediction. We focus on sequential modelling of text and audio and investigate whether the disfluencies present in individuals' speech relate to the extent of their cognitive impairment. Our results show that the proposed classification and regression schemes…
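The gating step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact gating formulation, so everything here is an assumption made for illustration: the function name `gated_fusion`, the scalar sigmoid gate, and the parameters `gate_weights` and `gate_bias` are all hypothetical, and the two input scores stand in for the unimodal decisions that the text and audio LSTMs would produce.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_score, audio_score, gate_weights, gate_bias):
    """Blend two unimodal decision scores with a scalar gate.

    Hypothetical parameterisation (not taken from the paper): the gate is
    a sigmoid over a linear function of both unimodal scores.
    """
    z = gate_weights[0] * text_score + gate_weights[1] * audio_score + gate_bias
    g = sigmoid(z)
    # g close to 1 trusts the text decision, g close to 0 trusts the audio one.
    fused = g * text_score + (1.0 - g) * audio_score
    return fused, g


# Illustrative values only; in a trained model the gate parameters are learned.
text_score = 0.8   # stand-in for the text LSTM's decision
audio_score = 0.3  # stand-in for the audio LSTM's decision
fused, gate = gated_fusion(text_score, audio_score,
                           gate_weights=(0.0, 0.0), gate_bias=0.0)
```

With zero gate weights and bias the gate is 0.5, so the fused score is the plain average of the two unimodal scores; non-zero parameters shift the blend toward one modality. Because the gate is a convex weight, the fused score always lies between the two unimodal scores.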
