TL;DR
This paper introduces a distributed speech separation algorithm for spatially unconstrained microphone arrays that leverages neural networks and spatial information to improve separation performance, especially with multiple sources and nodes.
Contribution
It presents a novel distributed neural network-based algorithm that exploits spatial information in unconstrained microphone arrays for improved speech separation.
Findings
Separation performance improves as the number of sources and nodes increases.
The algorithm outperforms traditional methods in meeting room scenarios.
Robustness to source number mismatch is demonstrated.
Abstract
Speech separation with several speakers is a challenging task because of the non-stationarity of speech and the strong signal similarity between interfering sources. Current state-of-the-art solutions can separate the different sources well using sophisticated deep neural networks, which are very tedious to train. When several microphones are available, spatial information can be exploited to design much simpler algorithms to discriminate between speakers. We propose a distributed algorithm that can process spatial information in a spatially unconstrained microphone array. The algorithm relies on a convolutional recurrent neural network that can exploit the signal diversity from the distributed nodes. In a typical meeting-room scenario, this algorithm can capture an estimate of each source in a first step and propagate it over the microphone array in order to increase the separation…
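The two-step idea in the abstract (a local estimate at each node, then an exchange of estimates across the array) can be illustrated with a minimal sketch. This is not the authors' implementation. The trained convolutional recurrent network is replaced here by a hypothetical stand-in, `placeholder_mask`, which computes a simple Wiener-like ratio mask. The function names, the one-channel-per-node layout, and the assumption that each node is dominated by a different speaker are all illustrative choices that do not come from the paper.

```python
import numpy as np


def placeholder_mask(local_mag, exchanged_mags, eps=1e-8):
    """Hypothetical stand-in for the trained CRNN mask estimator.

    With no exchanged estimates it passes the signal through unchanged;
    otherwise it returns a Wiener-like ratio mask with values in [0, 1].
    """
    if not exchanged_mags:
        return np.ones_like(local_mag)
    local_pow = local_mag ** 2
    other_pow = sum(m ** 2 for m in exchanged_mags)
    return local_pow / (local_pow + other_pow + eps)


def two_step_separation(node_specs, mask_fn=placeholder_mask):
    """Two-step separation over nodes (illustrative only).

    node_specs: complex array of shape (n_nodes, n_freq, n_frames),
    one STFT channel per node (an assumption made for brevity).
    Returns an array of the same shape with one refined estimate per node.
    """
    n_nodes = node_specs.shape[0]

    # Step 1: each node forms an estimate from its own signal only.
    first_pass = np.stack([mask_fn(np.abs(x), []) * x for x in node_specs])

    # Step 2: each node receives the other nodes' estimates and uses them
    # to compute a better mask for its own mixture.
    refined = []
    for k in range(n_nodes):
        others = [np.abs(first_pass[j]) for j in range(n_nodes) if j != k]
        mask = mask_fn(np.abs(first_pass[k]), others)
        refined.append(mask * node_specs[k])
    return np.stack(refined)
```

Calling `two_step_separation(mix)` on a stacked array of per-node STFTs returns one refined estimate per node. Swapping `mask_fn` for a learned model is where a network such as the one described in the abstract would plug in.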
