Rethinking complex-valued deep neural networks for monaural speech enhancement
Haibin Wu, Ke Tan, Buye Xu, Anurag Kumar, Daniel Wong

TL;DR
This paper critically assesses complex-valued deep neural networks for monaural speech enhancement, finding they do not outperform real-valued networks and are more computationally demanding.
Contribution
It systematically compares complex- and real-valued DNNs, revealing that complex-valued models offer no performance advantage and incur higher computational costs.
Findings
Complex-valued DNNs do not outperform real-valued DNNs in speech enhancement.
Complex models require significantly more computation.
Model capacity is hindered in small models when using complex-valued operations.
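One way to see where extra computation can come from is to count real-valued operations. The sketch below is a back-of-the-envelope illustration, not the paper's measurement: it assumes a naive complex multiplication (four real products per complex product), ignores biases and additions, and the function names and layer sizes are hypothetical.

```python
def real_linear_cost(n_in, n_out):
    """Real parameters and real multiplications of a real-valued n_in -> n_out layer."""
    params = n_in * n_out
    mults = n_in * n_out  # one product per weight
    return params, mults


def complex_linear_cost(n_in, n_out):
    """Same counts for a complex-valued layer, measured in real numbers."""
    params = 2 * n_in * n_out  # real and imaginary part of every weight
    mults = 4 * n_in * n_out   # (a + ib)(c + id) takes 4 real products (naive)
    return params, mults


# Hypothetical sizes chosen so both layers store the same number of real parameters.
c_params, c_mults = complex_linear_cost(256, 256)
r_params, r_mults = real_linear_cost(256, 512)

print("complex:", c_params, "params,", c_mults, "multiplications")
print("real:   ", r_params, "params,", r_mults, "multiplications")
```

Under these assumptions, the complex layer performs two real multiplications per stored parameter and the real layer performs one, so matching the models by parameter count does not match them by compute.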
Abstract
Despite multiple efforts made towards adopting complex-valued deep neural networks (DNNs), it remains an open question whether complex-valued DNNs are generally more effective than real-valued DNNs for monaural speech enhancement. This work is devoted to presenting a critical assessment by systematically examining complex-valued DNNs against their real-valued counterparts. Specifically, we investigate complex-valued DNN atomic units, including linear layers, convolutional layers, long short-term memory (LSTM), and gated linear units. By comparing complex- and real-valued versions of fundamental building blocks in the recently developed gated convolutional recurrent network (GCRN), we show how different mechanisms for basic blocks affect the performance. We also find that the use of complex-valued operations hinders the model capacity when the model size is small. In addition, we examine…
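To make the "atomic unit" idea concrete, here is a minimal sketch of a complex-valued linear layer written with real arithmetic only. It uses the standard identity (W_r + iW_i)(x_r + ix_i) = (W_r x_r - W_i x_i) + i(W_r x_i + W_i x_r). The function name, shapes, and random inputs are assumptions for illustration, and this is not claimed to be the implementation used in the paper.

```python
import numpy as np


def complex_linear(x_re, x_im, w_re, w_im):
    """Apply a complex weight matrix to a complex vector using real matmuls only."""
    y_re = w_re @ x_re - w_im @ x_im  # real part of the product
    y_im = w_re @ x_im + w_im @ x_re  # imaginary part of the product
    return y_re, y_im


# Hypothetical 4 -> 3 layer with random weights and input.
rng = np.random.default_rng(0)
w_re, w_im = rng.standard_normal((2, 3, 4))
x_re, x_im = rng.standard_normal((2, 4))

y_re, y_im = complex_linear(x_re, x_im, w_re, w_im)

# Cross-check against NumPy's native complex arithmetic.
y_ref = (w_re + 1j * w_im) @ (x_re + 1j * x_im)
print(np.allclose(y_re, y_ref.real), np.allclose(y_im, y_ref.imag))
```

The four real matrix products make explicit that a complex layer is a real layer with a particular weight-sharing pattern between the real and imaginary paths. That framing is what allows a like-for-like comparison with an unconstrained real-valued layer.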
Taxonomy
Topics: Speech and Audio Processing · Hand Gesture Recognition Systems · Indoor and Outdoor Localization Technologies
Methods: Concatenated Skip Connection · Max Pooling · Convolution · U-Net
