On the importance of normative data in speech-based assessment

Zeinab Noorian; Chloé Pou-Prom; Frank Rudzicz

arXiv:1712.00069 · cs.CL · December 4, 2017 · 22 cites

Open Access

TL;DR

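This paper shows that augmenting a sparse Alzheimer's disease (AD) data set, DementiaBank, with normative picture-description data (the Wisconsin Longitudinal Study or Talk2Me) and oversampling the minority class with ADASYN outperforms previous state-of-the-art results for binary AD classification on DementiaBank.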

Contribution

It combines sparse, difficult-to-acquire patient data with larger, more easily accessible normative data sets, and applies ADASYN minority-class oversampling to improve AD classification performance.

Findings

01

Outperforms state-of-the-art results in binary AD classification on DementiaBank

02

Combining normative and patient data is effective

03

Oversampling improves model accuracy

Abstract

Data sets for identifying Alzheimer's disease (AD) are often relatively sparse, which limits their ability to train generalizable models. Here, we augment such a data set, DementiaBank, with each of two normative data sets, the Wisconsin Longitudinal Study and Talk2Me, each of which employs a speech-based picture-description assessment. Through minority class oversampling with ADASYN, we outperform state-of-the-art results in binary classification of people with and without AD in DementiaBank. This work highlights the effectiveness of combining sparse and difficult-to-acquire patient data with relatively large and easily accessible normative datasets.
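The minority-class oversampling step mentioned in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' code: the function name `adasyn_sketch`, its parameters, and the toy data are assumptions made for illustration. It also simplifies ADASYN in one respect: instead of rounding a per-point count of synthetic samples, it draws seed points with probability proportional to how many majority-class neighbors each minority point has.

```python
import math
import random


def adasyn_sketch(X, y, minority_label, k=5, beta=1.0, seed=0):
    """Return synthetic minority-class points (a simplified ADASYN-style sketch).

    X: list of feature vectors, y: list of labels, k: neighborhood size,
    beta: desired balance level (1.0 = fully balance the two classes).
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority_label]
    n_min = len(min_idx)
    n_maj = len(y) - n_min
    total_new = round((n_maj - n_min) * beta)
    if total_new <= 0 or n_min < 2:
        return []  # already balanced, or too few minority points to interpolate

    def neighbors(i, pool):
        """Up to k indices from pool closest to X[i], excluding i itself."""
        others = [j for j in pool if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:k]

    # Hardness of each minority point: share of majority-class points
    # among its nearest neighbors in the full data set.
    hardness = []
    for i in min_idx:
        near = neighbors(i, range(len(X)))
        hardness.append(sum(y[j] != minority_label for j in near) / len(near))

    # Fall back to uniform weights if no minority point has majority neighbors.
    weights = hardness if sum(hardness) > 0 else [1.0] * n_min

    synthetic = []
    for i in rng.choices(min_idx, weights=weights, k=total_new):
        j = rng.choice(neighbors(i, min_idx))  # a nearby minority point
        lam = rng.random()                     # interpolation factor in [0, 1)
        synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy 2-D data, invented for illustration: 8 "control" points, 3 "AD" points.
X_toy = [[0.0, 0.0], [0.1, 0.3], [0.3, 0.1], [0.4, 0.4], [0.2, 0.6],
         [0.6, 0.2], [0.5, 0.5], [0.7, 0.6], [1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]
y_toy = ["control"] * 8 + ["AD"] * 3
new_points = adasyn_sketch(X_toy, y_toy, minority_label="AD", k=3)
print(len(new_points))  # 5 synthetic "AD" points, matching the class gap
```

In practice a maintained implementation, such as the `ADASYN` class in the imbalanced-learn package, would likely be preferable to a hand-rolled version; the sketch only makes the idea concrete: synthetic points are interpolated between neighboring minority samples, with more of them generated near minority samples that are surrounded by the majority class.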

Taxonomy

Topics: Natural Language Processing Techniques · Phonetics and Phonology Research · Topic Modeling