Swin-BERT: A Feature Fusion System designed for Speech-based Alzheimer's Dementia Detection

Yilin Pan; Yanpei Shi; Yijia Zhang; Mingyu Lu

arXiv:2410.07277 · eess.AS · October 11, 2024

Open Access

TL;DR

Swin-BERT is a speech-based system that fuses acoustic and linguistic features for Alzheimer's dementia detection and decouples the influence of age and gender from the acoustic features.

Contribution

The paper introduces Swin-BERT, a feature fusion system that integrates acoustic and linguistic information with age and gender decoupling for improved dementia detection.

Findings

01. Achieved 85.58% F-score on ADReSS dataset.

02. Achieved 87.32% F-score on ADReSSo dataset.

03. Outperformed previous methods on both datasets.
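
For readers unfamiliar with the metric, the sketch below shows how an F-score is typically computed. It assumes the reported F-score is the standard F1 (the harmonic mean of precision and recall), which this page does not state explicitly. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp, fp, fn are counts of true positives, false positives and
    false negatives for the positive (here: AD) class.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts, for illustration only.
example = f1_score(tp=20, fp=4, fn=4)
print(f"F1 = {example:.4f}")
```

Because F1 ignores true negatives, two systems with the same accuracy can have different F-scores when their errors fall on different classes.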

Abstract

Speech is commonly used to build automatic Alzheimer's dementia (AD) detection systems, because acoustic and linguistic abilities decline at the early stages in people living with AD. However, speech carries not only AD-related local and global information but also information unrelated to cognitive status, such as age and gender. In this paper, we propose a speech-based system named Swin-BERT for automatic dementia detection. For the acoustic part, our system is built on shifted-window multi-head attention, which was originally proposed to extract local and global information from images. To decouple the effect of age and gender on acoustic feature extraction, both are supplied as extra inputs to the acoustic system. For the linguistic part, the rhythm-related information, which varies significantly between people living with and without AD,…
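
The shifted-window idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified one-dimensional version of the window partitioning used in Swin-style models, applied to a hypothetical list of frame indices. The function name, window size and shift are assumptions chosen for illustration, and the attention computation and the masking that real implementations apply to wrapped-around positions are omitted.

```python
def partition_windows(frames, window_size, shift=0):
    """Cyclically shift a sequence, then split it into non-overlapping windows.

    With shift=0 this is the regular partition. With shift=window_size // 2
    the window boundaries move, so frames that sat on either side of a
    boundary in the regular partition now share a window.
    """
    if not frames:
        return []
    if len(frames) % window_size != 0:
        raise ValueError("sequence length must be a multiple of window_size")
    shift %= len(frames)
    rolled = frames[shift:] + frames[:shift]  # cyclic shift to the left
    return [rolled[i:i + window_size] for i in range(0, len(rolled), window_size)]


frames = list(range(8))  # hypothetical frame indices 0..7
regular = partition_windows(frames, window_size=4)
shifted = partition_windows(frames, window_size=4, shift=2)
print(regular)  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(shifted)  # [[2, 3, 4, 5], [6, 7, 0, 1]]
```

Attention computed inside each window captures local structure. Alternating regular and shifted partitions across layers lets information pass between neighbouring windows, which is how this family of models builds up a more global view.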

Taxonomy

Topics: Emotion and Mood Recognition

Methods: Attention Is All You Need · Linear Layer · Softmax · Multi-Head Attention