Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection
Kashaf Gulzar, Korbinian Riedhammer, Elmar Nöth, Andreas K. Maier, Paula Andrea Pérez-Toro

TL;DR
This paper analyzes biases in speech-based models for detecting cognitive impairment and depression, revealing performance disparities across demographic groups and emphasizing the importance of fairness in clinical speech applications.
Contribution
It systematically investigates demographic biases in self-supervised acoustic representations for cognitive impairment detection, highlighting disparities and the need for fairness-aware evaluation.
Findings
Wav2Vec 2.0 embeddings outperform traditional features in CI detection.
Performance disparities exist across gender and age groups.
Cross-task generalization between CI and depression classification is limited.
Abstract
Speech-based detection of cognitive impairment (CI) offers a promising non-invasive approach for early diagnosis, yet performance disparities across demographic and clinical subgroups remain underexplored, raising concerns around fairness and generalizability. This study presents a systematic bias analysis of acoustic-based CI and depression classification using the DementiaBank Pitt Corpus. We compare traditional acoustic features (MFCCs, eGeMAPS) with contextualized speech embeddings from Wav2Vec 2.0 (W2V2), and evaluate classification performance across gender, age, and depression-status subgroups. For CI detection, higher-layer W2V2 embeddings outperform baseline features (UAR up to 80.6%), but exhibit performance disparities; specifically, females and younger participants demonstrate lower discriminative power (AUC: 0.769 and 0.746, respectively) and substantial specificity…
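The subgroup evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `binary_metrics` and `metrics_by_group`, the label convention (1 = CI, 0 = control), and the toy data are all assumptions made for illustration. It computes sensitivity, specificity, and UAR (unweighted average recall, which for two classes is the mean of sensitivity and specificity) separately for each demographic group.

```python
from collections import defaultdict


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and UAR for binary labels.

    Assumed convention (not from the paper): 1 = CI, 0 = control.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Guard against groups that contain only one class.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    # UAR is the unweighted mean of the per-class recalls.
    uar = (sens + spec) / 2
    return {"sensitivity": sens, "specificity": spec, "uar": uar}


def metrics_by_group(y_true, y_pred, groups):
    """Compute the same metrics separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: binary_metrics(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy data, invented purely to exercise the functions.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["M", "M", "M", "M", "F", "F", "F", "F"]

report = metrics_by_group(y_true, y_pred, groups)
for group, scores in sorted(report.items()):
    print(group, scores)
```

Reporting the metrics per group, rather than only pooled, is what exposes the kind of gap the abstract describes: a pooled UAR can look acceptable while one subgroup has markedly lower specificity.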
Taxonomy
Topics: Voice and Speech Disorders · Emotion and Mood Recognition · Speech Recognition and Synthesis
