Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection

Kashaf Gulzar; Korbinian Riedhammer; Elmar Nöth; Andreas K. Maier; Paula Andrea Pérez-Toro

arXiv:2603.02937 · eess.AS · March 4, 2026

Open Access

TL;DR

This paper analyzes biases in speech-based models for detecting cognitive impairment and depression, revealing performance disparities across demographic groups and emphasizing the importance of fairness in clinical speech applications.

Contribution

It systematically investigates demographic biases in self-supervised acoustic representations for cognitive impairment detection, highlighting disparities and the need for fairness-aware evaluation.

Findings

01

Wav2Vec 2.0 embeddings outperform traditional features in CI detection.

02

Performance disparities exist across gender and age groups.

03

Cross-task generalization between CI and depression classification is limited.

Abstract

Speech-based detection of cognitive impairment (CI) offers a promising non-invasive approach for early diagnosis, yet performance disparities across demographic and clinical subgroups remain underexplored, raising concerns around fairness and generalizability. This study presents a systematic bias analysis of acoustic-based CI and depression classification using the DementiaBank Pitt Corpus. We compare traditional acoustic features (MFCCs, eGeMAPS) with contextualized speech embeddings from Wav2Vec 2.0 (W2V2), and evaluate classification performance across gender, age, and depression-status subgroups. For CI detection, higher-layer W2V2 embeddings outperform baseline features (UAR up to 80.6%), but exhibit performance disparities; specifically, females and younger participants demonstrate lower discriminative power (AUC: 0.769 and 0.746, respectively) and substantial specificity…
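
The subgroup evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the label encoding (1 = CI, 0 = control), and the toy labels are all assumptions made for illustration. It computes unweighted average recall (UAR, the mean of per-class recalls) and specificity (recall of the negative class) separately for each demographic subgroup.

```python
from collections import defaultdict


def recall_per_class(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return {label: hits[label] / totals[label] for label in totals}


def uar(y_true, y_pred):
    """Unweighted average recall: the mean of the per-class recalls."""
    recalls = recall_per_class(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def subgroup_report(y_true, y_pred, groups, negative_label=0):
    """Compute UAR and specificity separately for each subgroup.

    Specificity is the recall of the negative class, TN / (TN + FP).
    It is None for a subgroup that contains no negative samples.
    """
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)

    report = {}
    for g, (t_list, p_list) in by_group.items():
        recalls = recall_per_class(t_list, p_list)
        report[g] = {
            "uar": sum(recalls.values()) / len(recalls),
            "specificity": recalls.get(negative_label),
        }
    return report


# Toy labels invented for this example (assumed: 1 = CI, 0 = control).
# They are not data or results from the paper.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["M", "M", "M", "M", "F", "F", "F", "F"]

report = subgroup_report(y_true, y_pred, groups)
overall_uar = uar(y_true, y_pred)
```

On this toy input the pooled UAR hides a gap between the two groups, which is the kind of disparity a subgroup-level report is meant to surface.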

Taxonomy

Topics: Voice and Speech Disorders · Emotion and Mood Recognition · Speech Recognition and Synthesis