Single-Microphone Speech Enhancement and Separation Using Deep Learning
Morten Kolbæk

TL;DR
This PhD thesis explores deep learning techniques for single-microphone speech enhancement and separation, demonstrating improved generalizability, state-of-the-art separation results, and effective joint enhancement and separation without prior noise or speaker information.
Contribution
It introduces utterance-level permutation invariant training (uPIT), a deep learning technique for multi-talker speech separation, and provides insights into training data design for better generalizability of enhancement algorithms.
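The core idea behind a permutation invariant objective can be illustrated with a small sketch. The code below is not the thesis implementation: it assumes a mean squared error criterion, represents each utterance as a plain list of frames, and uses the hypothetical helper names `utterance_mse` and `upit_loss`. It evaluates every assignment of output streams to target speakers over the whole utterance and keeps the assignment with the lowest error.

```python
from itertools import permutations


def utterance_mse(estimate, target):
    """Mean squared error between two utterances given as lists of frames."""
    total, count = 0.0, 0
    for est_frame, tgt_frame in zip(estimate, target):
        for e, t in zip(est_frame, tgt_frame):
            total += (e - t) ** 2
            count += 1
    return total / count


def upit_loss(estimates, targets):
    """Return (loss, perm), where perm[i] is the output stream assigned to target i.

    The error is accumulated over the entire utterance before the best
    assignment is chosen, so one permutation applies to all frames.
    """
    num_speakers = len(targets)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(num_speakers)):
        loss = sum(
            utterance_mse(estimates[perm[i]], targets[i])
            for i in range(num_speakers)
        ) / num_speakers
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy data (made up for illustration): two speakers, two frames, two bins each.
targets = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [0.0, 2.0]]]
# The outputs are perfect but arrive in swapped order.
estimates = [targets[1], targets[0]]
loss, perm = upit_loss(estimates, targets)
```

In this toy case the swapped output order is not penalized, since the search finds the assignment that matches each output to the right speaker. The exhaustive search grows factorially with the number of speakers, which is manageable for the two- and three-talker mixtures typical of this task.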
Findings
uPIT achieves state-of-the-art multi-talker separation results.
Deep learning enhancement algorithms can be optimized for speech intelligibility.
Carefully designed training data improves generalizability of speech enhancement models.
Abstract
The cocktail party problem comprises the challenging task of understanding a speech signal in a complex acoustic environment, where multiple speakers and background noise signals simultaneously interfere with the speech signal of interest. A signal processing algorithm that can effectively increase the speech intelligibility and quality of speech signals in such complicated acoustic situations is highly desirable, especially for applications involving mobile communication devices and hearing assistive devices. Due to the re-emergence of machine learning techniques, today known as deep learning, the challenges involved with such algorithms might be overcome. In this PhD thesis, we study and develop deep learning-based techniques for two sub-disciplines of the cocktail party problem: single-microphone speech enhancement and single-microphone multi-talker speech separation. Specifically,…
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Indoor and Outdoor Localization Technologies
