TL;DR
This study assesses multichannel speech enhancement algorithms at the phoneme level, revealing gender-specific performance differences and highlighting the importance of phoneme and gender considerations in evaluation.
Contribution
It introduces phoneme- and gender-specific analysis for speech enhancement evaluation, uncovering performance disparities not visible at the utterance level.
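To illustrate what a phoneme- and gender-specific breakdown looks like in practice, here is a minimal sketch that groups per-segment scores by phoneme category and speaker gender. It is not the authors' evaluation pipeline: the record layout, the helper names `mean_by_group` and `gender_gap`, and all score values are hypothetical and chosen only to show how a pooled average can hide differences that appear once scores are split by phoneme category.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-segment scores: (speaker gender, phoneme category, metric value).
# The values are made up for illustration and are NOT results from the paper.
segments = [
    ("female", "plosive", 12.0),
    ("female", "plosive", 14.0),
    ("male", "plosive", 9.0),
    ("male", "plosive", 11.0),
    ("female", "vowel", 16.0),
    ("male", "vowel", 15.0),
]

def mean_by_group(rows, key):
    """Average the metric value over all rows that share the same key(row)."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {k: mean(v) for k, v in groups.items()}

# Pooled view: one number per gender, phoneme identity ignored.
by_gender = mean_by_group(segments, key=lambda r: r[0])

# Phoneme-level view: one number per (phoneme category, gender) pair.
by_phoneme_gender = mean_by_group(segments, key=lambda r: (r[1], r[0]))

def gender_gap(table):
    """Female-minus-male mean per phoneme category, where both genders are present."""
    categories = {cat for cat, _ in table}
    return {
        cat: table[(cat, "female")] - table[(cat, "male")]
        for cat in categories
        if (cat, "female") in table and (cat, "male") in table
    }

gaps = gender_gap(by_phoneme_gender)
```

With real data, each row would come from scoring one phoneme-aligned segment of an enhanced signal. Comparing `by_gender` with `gaps` shows whether a gender difference is spread evenly across phoneme categories or concentrated in a few of them.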
Findings
The tested algorithms reduce interference more effectively, and with fewer artifacts, on female speech.
Differences between genders are minimal at the utterance level but significant at the phoneme level.
Performance on perceptual and recognition metrics is also higher for female speech.
Abstract
Multichannel speech enhancement algorithms are essential for improving the intelligibility of speech signals in noisy environments. These algorithms are usually evaluated at the utterance level, but this approach overlooks the disparities in acoustic characteristics that are observed in different phoneme categories and between male and female speakers. In this paper, we investigate the impact of gender and phonetic content on speech enhancement algorithms. We motivate this approach by outlining phoneme- and gender-specific spectral features. Our experiments reveal that while utterance-level differences between genders are minimal, significant variations emerge at the phoneme level. Results show that the tested algorithms better reduce interference with fewer artifacts on female speech, particularly in plosives, fricatives, and vowels. Additionally, they demonstrate greater performance…
