Multi-channel end-to-end neural network for speech enhancement, source localization, and voice activity detection
Yuan Chen, Yicheng Hsu, Mingsian R. Bai

TL;DR
This paper introduces a multi-channel neural beamformer that pairs a beamformer with a multi-channel Deep Complex Convolution Recurrent Network (DCCRN) to enhance speech, localize sources, and detect voice activity in a single end-to-end framework.
Contribution
It proposes a new neural beamformer architecture that integrates a multi-channel DCCRN for speech enhancement and source localization in a single end-to-end system.
Findings
Effective speech enhancement with preserved quality
Accurate source localization capabilities
Reliable voice activity detection performance
Abstract
Speech enhancement and source localization have been active research areas for several decades, with a wide range of real-world applications. Recently, the Deep Complex Convolution Recurrent Network (DCCRN) has yielded impressive enhancement performance for single-channel systems. In this study, a neural beamformer consisting of a beamformer and a novel multi-channel DCCRN is proposed for speech enhancement and source localization. Complex-valued filters estimated by the multi-channel DCCRN serve as the weights of the beamformer. In addition, a one-stage learning-based procedure is employed for speech enhancement and source localization. The proposed network, composed of the multi-channel DCCRN and the auxiliary network, models the sound field while minimizing the distortionless response loss function. Simulation results show that the proposed neural beamformer is effective in enhancing speech…
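To make the role of the complex-valued filters concrete, the following is a minimal sketch of how estimated weights could be applied as a filter-and-sum beamformer in the short-time Fourier transform domain, together with one plausible form of a distortionless response penalty. It is not the authors' implementation: the function names `apply_beamformer` and `distortionless_loss`, the array shapes, and the exact form of the loss (mean squared deviation of the beamformer response from 1 toward an assumed steering vector) are all assumptions made for illustration, and the weights here are plain arrays rather than outputs of a DCCRN.

```python
import numpy as np


def apply_beamformer(weights, mixture):
    """Filter-and-sum beamforming in the STFT domain (illustrative sketch).

    weights, mixture: complex arrays broadcastable to (mics, frames, freqs).
    Returns the single-channel output y(t, f) = sum_m conj(w_m(t, f)) * x_m(t, f).
    """
    return np.sum(np.conj(weights) * mixture, axis=0)


def distortionless_loss(weights, steering):
    """Assumed penalty: mean squared deviation of w^H a from 1.

    steering: complex array broadcastable to the shape of weights; it stands in
    for the transfer function from the target source to each microphone.
    """
    response = np.sum(np.conj(weights) * steering, axis=0)
    return float(np.mean(np.abs(response - 1.0) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mics, frames, freqs = 4, 5, 3

    # Hypothetical unit-modulus steering vector, one phase per mic and frequency.
    steering = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(mics, 1, freqs)))

    # Delay-and-sum weights satisfy the distortionless constraint w^H a = 1.
    weights = np.broadcast_to(steering / mics, (mics, frames, freqs))

    # Noise-free mixture: the target spectrum as seen at each microphone.
    target = rng.standard_normal((frames, freqs)) + 1j * rng.standard_normal((frames, freqs))
    mixture = steering * target

    enhanced = apply_beamformer(weights, mixture)
    print("max reconstruction error:", np.max(np.abs(enhanced - target)))
    print("distortionless penalty:", distortionless_loss(weights, steering))
```

In this toy setting the weights are fixed delay-and-sum filters, so the target passes through undistorted. In a learned system of the kind the abstract describes, the weights would instead be produced by the network per time-frequency bin, and a penalty of this general kind would be one term pushing them toward a distortionless response.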
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Adaptive Filtering Techniques
Methods: Convolution
