Data Augmentation for Pathological Speech Enhancement

Mingchi Hou; Enno Hermann; Ina Kodrasi

arXiv:2602.14671·eess.AS·February 17, 2026

Open Access

TL;DR

This study evaluates data augmentation techniques for improving speech enhancement models on pathological speech, finding noise augmentation most effective but noting persistent performance gaps.

Contribution

The paper systematically compares transformative, generative, and noise augmentation strategies for pathological speech enhancement, across both predictive and generative enhancement models, revealing their relative effectiveness and limitations.

Findings

01 Noise augmentation consistently yields the largest and most robust performance gains.

02 Transformative augmentation provides moderate improvements.

03 Generative augmentation yields limited benefits and can harm performance as the amount of synthetic data increases.

Abstract

The performance of state-of-the-art speech enhancement (SE) models considerably degrades for pathological speech due to atypical acoustic characteristics and limited data availability. This paper systematically investigates data augmentation (DA) strategies to improve SE performance for pathological speakers, evaluating both predictive and generative SE models. We examine three DA categories, i.e., transformative, generative, and noise augmentation, assessing their impact with objective SE metrics. Experimental results show that noise augmentation consistently delivers the largest and most robust gains, transformative augmentations provide moderate improvements, while generative augmentation yields limited benefits and can harm performance as the amount of synthetic data increases. Furthermore, we show that the effectiveness of DA varies depending on the SE model, with DA being more…
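
As an illustration of the noise augmentation category named in the abstract, the sketch below mixes a clean utterance with a noise recording at a randomly drawn signal-to-noise ratio (SNR). It is a generic example under stated assumptions, not the authors' pipeline: the function names `mix_at_snr` and `augment_with_noise`, the 0 to 20 dB SNR range, and the use of plain Python lists for waveforms are all hypothetical choices made here, since the abstract does not specify noise sources or SNR settings.

```python
import math
import random


def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean speech, scaled so the mixture has the requested SNR in dB."""
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]  # crop noise to the utterance length
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(x * x for x in noise) / len(noise)
    if clean_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # Choose gain g so that clean_power / (g^2 * noise_power) = 10^(snr_db / 10).
    gain = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def augment_with_noise(clean, noises, snr_range_db=(0.0, 20.0), rng=random):
    """Pick a random noise recording and SNR (assumed range), return (noisy, snr_db)."""
    noise = rng.choice(noises)
    snr_db = rng.uniform(*snr_range_db)
    return mix_at_snr(clean, noise, snr_db), snr_db


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [math.sin(0.1 * i) for i in range(1000)]    # stand-in for a speech waveform
    noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # stand-in for a noise recording
    noisy, snr_db = augment_with_noise(clean, [noise], rng=rng)
    print(f"mixed {len(noisy)} samples at {snr_db:.1f} dB SNR")
```

Each call produces a new (noisy, clean) training pair from the same clean utterance, which is how this kind of augmentation increases the variety of training conditions without requiring additional pathological speech recordings.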

Taxonomy

Topics: Speech and Audio Processing · Voice and Speech Disorders · Speech Recognition and Synthesis