Phoneme-Based Ratio Mask Estimation for Reverberant Speech Enhancement in Cochlear Implant Processors

Kevin M. Chu, Leslie M. Collins, and Boyla O. Mainsah

arXiv:2105.14135·eess.AS·June 1, 2021

Open Access

TL;DR

This paper introduces a phoneme-based mask estimation method for reverberant speech enhancement in cochlear implant (CI) processors. With perfect knowledge of the phoneme, the method improved speech intelligibility, which suggests a potential benefit for CI users.

Contribution

The paper proposes a phoneme-specific mask estimation approach in which a separate mask estimation model is trained for each phoneme, using phonemic information to improve reverberant speech enhancement for cochlear implant users.

Findings

01 Phoneme-based masks yield higher speech intelligibility than conventional masks.

02 The improvement is observed when the phoneme is known perfectly.

03 The results suggest a potential benefit for cochlear implant users in reverberant environments.

Abstract

Cochlear implant (CI) users have considerable difficulty in understanding speech in reverberant listening environments. Time-frequency (T-F) masking is a common technique that aims to improve speech intelligibility by multiplying reverberant speech by a matrix of gain values to suppress T-F bins dominated by reverberation. Recently proposed mask estimation algorithms leverage machine learning approaches to distinguish between target speech and reverberant reflections. However, the spectro-temporal structure of speech is highly variable and dependent on the underlying phoneme. One way to potentially overcome this variability is to leverage explicit knowledge of phonemic information during mask estimation. This study proposes a phoneme-based mask estimation algorithm, where separate mask estimation models are trained for each phoneme. Sentence recognition tests were conducted in normal…
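The two ideas in the abstract, T-F masking as element-wise multiplication by gain values and one mask estimator per phoneme, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the paper: the energy-ratio definition of the mask is one common formulation, the array layout is (frames, channels), and `models` is a hypothetical dictionary mapping a phoneme label to any callable that returns gains for the frames it receives.

```python
import numpy as np


def ratio_mask(direct_mag, late_mag, eps=1e-12):
    """Energy-ratio gain per T-F bin (assumed definition, values in [0, 1]).

    direct_mag: magnitudes of the target (direct) speech, shape (frames, channels)
    late_mag:   magnitudes of the reverberant reflections, same shape
    """
    direct_energy = direct_mag ** 2
    late_energy = late_mag ** 2
    # eps avoids division by zero in bins where both energies are zero.
    return direct_energy / (direct_energy + late_energy + eps)


def apply_mask(reverberant_mag, mask):
    """T-F masking: multiply each bin of the reverberant speech by its gain."""
    return reverberant_mag * mask


def phoneme_based_mask(features, phoneme_labels, models, n_channels):
    """Estimate a mask by routing each frame to its phoneme's estimator.

    features:       shape (frames, n_features)
    phoneme_labels: one label per frame (assumed known, e.g. oracle labels)
    models:         hypothetical dict, label -> callable returning
                    an array of shape (n_selected_frames, n_channels)
    """
    labels = np.asarray(phoneme_labels)
    mask = np.zeros((features.shape[0], n_channels))
    for phoneme in np.unique(labels):
        frames = labels == phoneme
        mask[frames] = models[phoneme](features[frames])
    # Keep gains in the valid range regardless of what the estimator returns.
    return np.clip(mask, 0.0, 1.0)


if __name__ == "__main__":
    # Toy stand-ins for trained estimators: constant gains per phoneme.
    toy_models = {
        "aa": lambda x: np.full((len(x), 2), 0.8),
        "s": lambda x: np.full((len(x), 2), 0.2),
    }
    feats = np.zeros((3, 4))
    est = phoneme_based_mask(feats, ["aa", "s", "aa"], toy_models, n_channels=2)
    print(apply_mask(np.ones((3, 2)), est))
```

In a real system the constant-gain lambdas would be replaced by trained models, and the per-frame phoneme labels would come either from ground-truth annotation or from a phoneme classifier.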

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Acoustic Wave Phenomena Research