Speech Separation Using Partially Asynchronous Microphone Arrays Without Resampling
Ryan M. Corey, Andrew C. Singer

TL;DR
This paper introduces a speech separation method for asynchronous microphone arrays that avoids resampling: the distributed array is divided into synchronous subarrays, the time-varying signal statistics are estimated jointly across all subarrays, and a separate time-varying spatial filter is designed for each subarray.
Contribution
The proposed method eliminates the need for offset estimation and resampling in asynchronous array processing by using synchronous subarrays and joint statistical estimation.
Findings
Separates speech sources in mixtures recorded on moving microphone arrays
Works without estimating sample rate offsets
Applicable to both stationary and moving arrays
Abstract
We consider the problem of separating speech sources captured by multiple spatially separated devices, each of which has multiple microphones and samples its signals at a slightly different rate. Most asynchronous array processing methods rely on sample rate offset estimation and resampling, but these offsets can be difficult to estimate if the sources or microphones are moving. We propose a source separation method that does not require offset estimation or signal resampling. Instead, we divide the distributed array into several synchronous subarrays. All arrays are used jointly to estimate the time-varying signal statistics, and those statistics are used to design separate time-varying spatial filters in each array. We demonstrate the method for speech mixtures recorded on both stationary and moving microphone arrays.
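The last step described in the abstract, designing a separate time-varying spatial filter in each subarray from jointly estimated statistics, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a local Gaussian model at a single frequency bin, in which each source k has a time-varying variance v[k, t] shared by all subarrays and a fixed spatial covariance matrix within each subarray, and it uses a multichannel Wiener filter as the spatial filter. The function name `subarray_wiener_filters`, its parameters, and the random test data are hypothetical, and the joint estimation of v is not shown.

```python
import numpy as np

def subarray_wiener_filters(v, R_spatial, eps=1e-10):
    """Time-varying multichannel Wiener filters for ONE synchronous subarray.

    v         : (K, T) nonnegative source variances, assumed shared by all subarrays
    R_spatial : (K, M, M) Hermitian PSD spatial covariances for this subarray
    returns   : (T, K, M, M) filters; W[t, k] @ x[t] estimates the image of source k
    """
    M = R_spatial.shape[-1]
    # Time-varying covariance of each source image: v[k, t] * R_spatial[k]
    R_src = np.einsum("kt,kij->tkij", v, R_spatial)
    # Mixture covariance at each frame, lightly regularized before inversion
    R_mix = R_src.sum(axis=1) + eps * np.eye(M)
    R_mix_inv = np.linalg.inv(R_mix)  # batched inverse over the T frames
    # W[t, k] = R_src[t, k] @ R_mix[t]^{-1}
    return R_src @ R_mix_inv[:, None, :, :]

# Hypothetical data: K = 2 sources, T = 5 frames, subarrays with 2 and 3 microphones.
rng = np.random.default_rng(0)
K, T = 2, 5
v = rng.uniform(0.1, 1.0, size=(K, T))  # the same variances are reused by every subarray

def random_spatial_cov(M):
    # Rank-one term plus a scaled identity, so each matrix is full rank
    a = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))
    return np.einsum("ki,kj->kij", a, a.conj()) + 0.1 * np.eye(M)

# One filter bank per subarray; no cross-subarray phase information is used.
filters = [subarray_wiener_filters(v, random_spatial_cov(M)) for M in (2, 3)]
```

In this sketch only the variances v are shared between subarrays. Each filter combines microphones within a single synchronous subarray, so no sample rate offset has to be estimated and no signal has to be resampled.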
