Improving spatial cues for hearables using a parameterized binaural CDR estimator
Reza Ghanavi, Craig Jin

TL;DR
This paper introduces a parameterized binaural coherence-to-diffuse power ratio (CDR) estimator that improves spatial cue preservation in speech enhancement, outperforms existing estimators on objective quality metrics, and can be adapted to different acoustic environments.
Contribution
A robust, parameterized binaural CDR estimator based on a geometrical interpretation of the spatial coherence function between the binaural microphone signals, which enhances spatial cue preservation and performance in natural, noisy environments.
Findings
Improved PESQ and SRMR scores over three state-of-the-art CDR estimators in cocktail-party-like environments.
Enhanced robustness and adaptability in varying sound environments.
Positive informal subjective evaluation results.
Abstract
We investigate a speech enhancement method based on the binaural coherence-to-diffuse power ratio (CDR), which preserves auditory spatial cues for maskers and a broadside target. Conventional CDR estimators typically rely on a mathematical coherence model of the desired signal and/or the diffuse noise field in their formulation, which may limit their accuracy in natural environments. This work proposes a new robust, parameterized, directional binaural CDR estimator. The estimator is calculated in the time-frequency domain and is based on a geometrical interpretation of the spatial coherence function between the binaural microphone signals. The binaural performance of the new CDR estimator is compared with that of three state-of-the-art CDR estimators in cocktail-party-like environments and shows improvements in several objective speech quality metrics, such as PESQ and SRMR. We…
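To make the quantities in the abstract concrete, the following is a minimal Python sketch of a short-time spatial coherence estimate between left and right microphone signals, followed by a conventional model-based CDR computed from it. It is not the estimator proposed in the paper. It illustrates the kind of model-dependent baseline the abstract contrasts against, assuming a fully coherent broadside target (coherence 1) and a free-field diffuse noise model (a sinc function that ignores head shadowing). The function names, the smoothing factor `alpha`, the microphone spacing `mic_distance`, and the frame settings are all illustrative assumptions.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Hann-windowed STFT; returns an array of shape (n_frames, n_bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def short_time_coherence(left, right, alpha=0.9, eps=1e-12):
    """Complex coherence per time-frequency bin via recursive averaging.

    alpha is an assumed smoothing factor, not a value from the paper.
    """
    L, R = stft(left), stft(right)
    phi_ll = np.zeros(L.shape[1])
    phi_rr = np.zeros(L.shape[1])
    phi_lr = np.zeros(L.shape[1], dtype=complex)
    coh = np.empty_like(L)
    for t in range(L.shape[0]):
        # Recursively smoothed auto- and cross-power spectra.
        phi_ll = alpha * phi_ll + (1 - alpha) * np.abs(L[t]) ** 2
        phi_rr = alpha * phi_rr + (1 - alpha) * np.abs(R[t]) ** 2
        phi_lr = alpha * phi_lr + (1 - alpha) * L[t] * np.conj(R[t])
        coh[t] = phi_lr / np.sqrt(phi_ll * phi_rr + eps)
    return coh

def diffuse_coherence(frame_len, fs, mic_distance=0.17, c=343.0):
    """Free-field diffuse-field coherence model (assumption: no head shadow)."""
    freqs = np.fft.rfftfreq(frame_len, 1.0 / fs)
    # np.sinc(x) = sin(pi x) / (pi x), so this equals sinc(2 pi f d / c).
    return np.sinc(2.0 * freqs * mic_distance / c)

def cdr_from_coherence(gamma_x, gamma_n, eps=1e-9):
    """Model-based CDR assuming a broadside target with coherence 1.

    Follows from gamma_x = (CDR * 1 + gamma_n) / (CDR + 1).
    """
    den = gamma_x - 1.0
    den = np.where(np.abs(den) < eps, -eps, den)  # guard fully coherent bins
    cdr = np.real((gamma_n - gamma_x) / den)
    return np.maximum(cdr, 0.0)  # a power ratio cannot be negative
```

In a speech enhancement pipeline, a CDR estimate like this would typically be mapped to a real-valued gain that is applied identically to both ear signals, which is one common way to leave interaural cues untouched. Whether the paper uses this gain rule is not stated in the abstract.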
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Acoustic Wave Phenomena Research
