Single-Channel Speech Enhancement with Deep Complex U-Networks and Probabilistic Latent Space Models
Eike J. Nustede, Jörn Anemüller

TL;DR
This paper introduces a deep complex U-Network with a probabilistic latent space for speech enhancement, demonstrating significant improvements over previous models in various noisy and reverberant conditions.
Contribution
The paper extends the complex U-Network architecture by integrating a variational latent space, improving speech enhancement performance and generalization in noisy environments.
Findings
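Achieves an SI-SDR of up to 20.2 dB in evaluations on the MS-DNS 2020 and Voicebank+Demand datasets.
Outperforms its ablated version without the probabilistic latent space by about 0.5 to 1.4 dB SI-SDR, WaveUNet by 2-2.4 dB, and PHASEN by 6.7 dB.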
Complex spectrum encoding improves performance in different acoustic conditions.
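For readers unfamiliar with the metric behind these numbers, the following is a minimal sketch of the commonly used scale-invariant signal-to-distortion ratio (SI-SDR), written with NumPy. It is not taken from the paper: the function name `si_sdr`, the `eps` stabilizer, the mean removal, and the synthetic test signals are assumptions made for illustration.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB (illustrative sketch, not the authors' code)."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Remove the mean so a constant offset does not count as distortion (assumed convention).
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to find the optimally scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    residual = estimate - target
    # Ratio of target energy to residual energy, in decibels.
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(residual, residual) + eps))


# Hypothetical signals: white noise stands in for speech, with additive noise at 0.1 amplitude.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
noisy = clean + 0.1 * rng.standard_normal(16000)

score = si_sdr(noisy, clean)
score_scaled = si_sdr(3.0 * noisy, clean)  # rescaling the estimate leaves the score unchanged
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the enhanced signal does not change the score, which is why the metric is called scale-invariant.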
Abstract
In this paper, we propose to extend the deep, complex U-Network architecture for speech enhancement by incorporating a probabilistic (i.e., variational) latent space model. The proposed model is evaluated against several ablated versions of itself in order to study the effects of the variational latent space model, complex-value processing, and self-attention. Evaluation on the MS-DNS 2020 and Voicebank+Demand datasets yields consistently high performance. E.g., the proposed model achieves an SI-SDR of up to 20.2 dB, about 0.5 to 1.4 dB higher than its ablated version without probabilistic latent space, 2-2.4 dB higher than WaveUNet, and 6.7 dB above PHASEN. Compared to real-valued magnitude spectrogram processing with a variational U-Net, the complex U-Net achieves an improvement of up to 4.5 dB SI-SDR. Complex spectrum encoding as magnitude and phase yields best performance in…
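As a rough illustration of what a probabilistic (variational) latent space involves, the sketch below shows the standard reparameterization step and KL regularizer of a diagonal-Gaussian latent model in NumPy. It assumes a standard-normal prior and hypothetical bottleneck outputs `mu` and `log_var`; the abstract does not specify the paper's exact formulation, so none of these names or choices should be read as the authors' implementation.

```python
import numpy as np


def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, diag(exp(log_var))) and N(0, I), summed over dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)


# Hypothetical 8-dimensional latent statistics, as an encoder bottleneck might produce.
rng = np.random.default_rng(0)
mu = np.zeros(8)
log_var = np.zeros(8)

z = sample_latent(mu, log_var, rng)
kl_zero = kl_to_standard_normal(mu, log_var)       # posterior equals prior
kl_shift = kl_to_standard_normal(np.ones(8), log_var)  # mean shifted by 1 in each dimension
```

In a variational model of this kind, the KL term is typically added to the reconstruction loss during training, and at inference the mean can be used in place of a random sample.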
Taxonomy
Methods: Max Pooling · Convolution · Concatenated Skip Connection · U-Net
