Leveraging Spatial Cues from Cochlear Implant Microphones to Efficiently Enhance Speech Separation in Real-World Listening Scenes
Feyisayo Olalere, Kiki van der Heijden, Christiaan H. Stronks, Jeroen Briaire, Johan HM Frijns, Marcel van Gerven

TL;DR
This paper investigates how spatial cues from cochlear implant microphones can be leveraged to improve speech separation in real-world environments, addressing challenges posed by reverberation and spatial ambiguity for assistive hearing devices.
Contribution
It introduces methods to utilize both implicit and explicit spatial cues to enhance speech separation performance in cochlear implant scenarios, especially in complex acoustic environments.
Findings
Spatial cues improve separation for spatially separated talkers
Explicit spatial cues help most when implicit cues are weak
Training on real-world data enhances model generalizability
Abstract
Speech separation approaches for single-channel, dry speech mixtures have significantly improved. However, real-world spatial and reverberant acoustic environments remain challenging, limiting the effectiveness of these approaches for assistive hearing devices like cochlear implants (CIs). To address this, we quantify the impact of real-world acoustic scenes on speech separation and explore how spatial cues can enhance separation quality efficiently. We analyze performance based on implicit spatial cues (inherent in the acoustic input and learned by the model) and explicit spatial cues (manually calculated spatial features added as auxiliary inputs). Our findings show that spatial cues (both implicit and explicit) improve separation for mixtures with spatially separated and nearby talkers. Furthermore, spatial cues enhance separation when spectral cues are ambiguous, such as when voices…
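To make the notion of explicit spatial cues concrete, the sketch below computes two hand-crafted features from a two-microphone signal: the inter-microphone level difference (ILD) and phase difference (IPD) per time-frequency bin. The abstract does not say which features the authors use, so the choice of ILD and IPD, the function names `stft` and `spatial_features`, and the STFT parameters are all assumptions made for illustration. Features like these could be stacked with the spectrogram as auxiliary inputs to a separation model.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Hann-windowed short-time Fourier transform, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def spatial_features(mic_a, mic_b, n_fft=512, hop=256, eps=1e-8):
    """Return per-bin level difference (dB) and phase difference (radians).

    Hypothetical helper: the paper's actual feature set is not given in the abstract.
    """
    A = stft(mic_a, n_fft, hop)
    B = stft(mic_b, n_fft, hop)
    # Level difference: ratio of magnitudes in dB; eps avoids log of zero.
    ild = 20.0 * np.log10((np.abs(A) + eps) / (np.abs(B) + eps))
    # Phase difference: angle of the cross-spectrum, wrapped to (-pi, pi].
    ipd = np.angle(A * np.conj(B))
    return ild, ipd

# Toy signal: a 500 Hz tone that is half as loud at the second microphone.
fs = 16000
t = np.arange(fs) / fs
source = np.sin(2 * np.pi * 500 * t)
ild, ipd = spatial_features(source, 0.5 * source)
tone_bin = int(round(500 / (fs / 512)))  # frequency bin closest to 500 Hz
```

In this toy case the level difference at the tone's bin is about 6 dB (a factor of two in amplitude) and the phase difference is zero, because the second channel is a scaled, undelayed copy of the first.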
Taxonomy
Topics: Speech and Audio Processing
