Magnitude-and-phase-aware Speech Enhancement with Parallel Sequence Modeling
Yuewei Zhang, Huanbin Zou, Jie Zhu

TL;DR
This paper introduces MPCRN, a speech enhancement model in which a real-valued network estimates a magnitude mask and a normalized cIRM. Combined with a parallel sequence modeling block, it performs better than previous phase estimation methods without resorting to complex-valued neural networks.
Contribution
It proposes a magnitude-and-phase-aware CRN model with parallel sequence modeling that avoids complex-valued networks while improving speech enhancement quality.
Findings
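MPCRN performs better than previous phase estimation methods for speech enhancement.
Estimating magnitude and phase with a real-valued network avoids the significant increase in model complexity caused by complex-valued networks.
The parallel sequence modeling (PSM) block improves the RNN block in CRN-based SE models.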
Abstract
In speech enhancement (SE), phase estimation is important for perceptual quality, so many methods take the complex short-time Fourier transform (STFT) spectrum of clean speech or the complex ideal ratio mask (cIRM) as the learning target. To predict these complex targets, the common solution is to design a complex neural network, or to use a real network to separately predict the real and imaginary parts of the target. In this paper, we instead propose to use a real network to estimate the magnitude mask and normalized cIRM, which not only avoids the significant increase in model complexity caused by complex networks, but also shows better performance than previous phase estimation methods. Meanwhile, we devise a parallel sequence modeling (PSM) block to improve the RNN block in the convolutional recurrent network (CRN)-based SE model. We name our method magnitude-and-phase-aware and…
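To make the two learning targets concrete, here is a minimal sketch of how a magnitude mask and a normalized cIRM could be derived from clean and noisy STFT bins and then recombined. The abstract does not define the normalization, so this sketch assumes the normalized cIRM is the cIRM divided by its own magnitude (a unit-modulus, phase-only mask). The function names, `EPS`, and the sample bin values are hypothetical and are not taken from the paper.

```python
EPS = 1e-8  # hypothetical guard against division by zero in silent bins


def ideal_targets(clean_bin, noisy_bin):
    """Return (magnitude mask, normalized cIRM) for one time-frequency bin."""
    cirm = clean_bin / (noisy_bin + EPS)   # complex ideal ratio mask S / Y
    mag_mask = abs(cirm)                   # real, non-negative magnitude mask
    # Assumption: "normalized" means unit modulus, so only the phase survives.
    norm_cirm = cirm / (abs(cirm) + EPS)
    return mag_mask, norm_cirm


def apply_targets(mag_mask, norm_cirm, noisy_bin):
    """Recombine both targets with the noisy bin to estimate the clean bin."""
    return mag_mask * norm_cirm * noisy_bin


# Toy STFT bins (illustrative values only, not real speech data).
clean_bins = [complex(0.5, -0.2), complex(-1.0, 0.3), complex(0.1, 0.9)]
noisy_bins = [complex(0.7, 0.1), complex(-0.6, 0.8), complex(0.4, 0.5)]

targets = [ideal_targets(s, y) for s, y in zip(clean_bins, noisy_bins)]
enhanced_bins = [
    apply_targets(m, c, y) for (m, c), y in zip(targets, noisy_bins)
]
```

With ideal targets, `enhanced_bins` matches `clean_bins` up to the small error introduced by `EPS`. In a trained model, a real-valued network would predict the magnitude mask and the real and imaginary parts of the phase-only mask instead of computing them from the clean signal.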
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Indoor and Outdoor Localization Technologies
Methods: Convolutional Recurrent Network (CRN)
