3D Neural Beamforming for Multi-channel Speech Separation Against Location Uncertainty
Rongzhi Gu, Shi-Xiong Zhang, Dong Yu

TL;DR
This paper introduces a 3D neural beamformer for multi-channel speech separation that tolerates uncertainty in the speaker's location and separates speakers with similar directions but different elevations or distances.
Contribution
It extends traditional 1D directional beam patterns to 3D, enabling separation of speakers with similar directions but different elevations or distances, and introduces a 3D region feature to handle speaker location uncertainty.
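To illustrate why a direction-only model cannot tell apart two speakers on the same bearing, the sketch below compares a plane-wave (far-field) steering vector with a spherical-wave (near-field) one. It is not the paper's feature or beamformer. The microphone geometry, the source positions, the 2 kHz test frequency, and all function names are assumptions chosen for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def spherical_to_cartesian(azimuth, elevation, distance):
    """Convert azimuth/elevation (radians) and distance (metres) to x, y, z."""
    return distance * np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])


def far_field_delays(mics, azimuth, elevation):
    """Plane-wave delays relative to the origin: a function of direction only."""
    unit = spherical_to_cartesian(azimuth, elevation, 1.0)
    return -(mics @ unit) / SPEED_OF_SOUND


def near_field_delays(mics, source):
    """Spherical-wave delays relative to mic 0: a function of the 3D position."""
    dist = np.linalg.norm(mics - source, axis=1)
    return (dist - dist[0]) / SPEED_OF_SOUND


def steering_vector(delays, freq_hz):
    """Per-microphone phase terms for a narrowband signal at freq_hz."""
    return np.exp(-2j * np.pi * freq_hz * delays)


def similarity(a, b):
    """Magnitude of the normalized inner product (1.0 means indistinguishable)."""
    return abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))


# Hypothetical 4-microphone array with 20 cm arms (not taken from the paper).
mics = np.array([
    [0.0, 0.0, 0.0],
    [0.2, 0.0, 0.0],
    [0.0, 0.2, 0.0],
    [0.0, 0.0, 0.2],
])

# Two hypothetical speakers on the same bearing at different distances.
azimuth, elevation = np.deg2rad(30.0), np.deg2rad(10.0)
near_speaker = spherical_to_cartesian(azimuth, elevation, 0.3)
far_speaker = spherical_to_cartesian(azimuth, elevation, 1.5)

freq_hz = 2000.0

# Direction-only model: both speakers map to the same steering vector.
sv_dir = steering_vector(far_field_delays(mics, azimuth, elevation), freq_hz)
far_sim = similarity(sv_dir, sv_dir)

# Position-aware model: distance changes the inter-microphone delays.
sv_a = steering_vector(near_field_delays(mics, near_speaker), freq_hz)
sv_b = steering_vector(near_field_delays(mics, far_speaker), freq_hz)
near_sim = similarity(sv_a, sv_b)

print(f"far-field similarity:  {far_sim:.3f}")
print(f"near-field similarity: {near_sim:.3f}")
```

Under the plane-wave model the two speakers share one steering vector, so their similarity is 1 by construction. Under the spherical-wave model the similarity drops below 1 for this geometry, which is the kind of cue a position-aware spatial feature can exploit.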
Findings
Achieves performance comparable to models given ground-truth speaker locations
Effectively separates speakers with similar directions but different elevations
Demonstrates robustness in in-car scenarios
Abstract
Multi-channel speech separation using speaker's directional information has demonstrated significant gains over blind speech separation. However, it has two limitations. First, substantial performance degradation is observed when the coming directions of two sounds are close. Second, the result highly relies on the precise estimation of the speaker's direction. To overcome these issues, this paper proposes 3D features and an associated 3D neural beamformer for multi-channel speech separation. Previous works in this area are extended in two important directions. First, the traditional 1D directional beam patterns are generalized to 3D. This enables the model to extract speech from any target region in the 3D space. Thus, speakers with similar directions but different elevations or distances become separable. Second, to handle the speaker location uncertainty, previously proposed spatial…
Taxonomy
Topics: Speech and Audio Processing · Blind Source Separation Techniques · Advanced Adaptive Filtering Techniques
