SAMbA: Speech enhancement with Asynchronous ad-hoc Microphone Arrays
Nicolas Furnon (LORIA), Romain Serizel (MULTISPEECH, LORIA), Slim Essid (IDS, S2A, LTCI), Irina Illina (LORIA)

TL;DR
This paper introduces SAMbA, a deep neural network approach for speech enhancement in ad-hoc microphone arrays that effectively handles device asynchronization using an attention mechanism, avoiding costly resynchronization.
Contribution
The paper presents a novel DNN-based speech enhancement method that is robust to asynchronization in ad-hoc microphone arrays through an attention mechanism, eliminating the need for resynchronization.
Findings
Attention mechanism improves robustness to asynchronization
Asynchronization mainly affects DNN performance, not spatial filtering
Attention mechanism leads to unsupervised estimation of the asynchronization parameters
Abstract
Speech enhancement in ad-hoc microphone arrays is often hindered by the asynchronization of the devices composing the microphone array. Asynchronization comes from sampling time offset and sampling rate offset which inevitably occur when the microphones are embedded in different hardware components. In this paper, we propose a deep neural network (DNN)-based speech enhancement solution that is suited for applications in ad-hoc microphone arrays because it is distributed and copes with asynchronization. We show that asynchronization has a limited impact on the spatial filtering and mostly affects the performance of the DNNs. Instead of resynchronising the signals, which requires costly processing steps, we use an attention mechanism which makes the DNNs, thus our whole pipeline, robust to asynchronization. We also show that the attention mechanism leads to the asynchronization parameters…
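To make the two sources of asynchronization named in the abstract concrete, here is a minimal sketch of how a sampling time offset (STO) and a sampling rate offset (SRO) could be simulated on a reference signal. It is an illustration only, not the authors' pipeline: the function names, the parameters `sto_s` and `sro_ppm`, the sign convention, and the use of linear interpolation are all assumptions made for this example.

```python
import numpy as np


def sample_positions(num_samples, fs, sto_s=0.0, sro_ppm=0.0):
    """Positions (in reference-clock samples) at which a drifting device samples.

    Assumed model: the device runs at fs * (1 + eps) with eps = sro_ppm * 1e-6
    and starts recording sto_s seconds after the reference device.
    """
    eps = sro_ppm * 1e-6
    n = np.arange(num_samples)
    return n / (1.0 + eps) + sto_s * fs


def apply_async(x, fs, sto_s=0.0, sro_ppm=0.0):
    """Resample x as if it had been captured by the asynchronous device.

    Linear interpolation is a simplification; positions outside the
    reference signal are filled with zeros.
    """
    x = np.asarray(x, dtype=float)
    pos = sample_positions(len(x), fs, sto_s, sro_ppm)
    return np.interp(pos, np.arange(len(x)), x, left=0.0, right=0.0)


if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs
    reference = np.sin(2 * np.pi * 440 * t)
    # Hypothetical offsets: 1 ms start delay and a 100 ppm clock mismatch.
    shifted = apply_async(reference, fs, sto_s=1e-3, sro_ppm=100.0)
    drift = sample_positions(fs + 1, fs, sro_ppm=100.0)[-1] - fs
    print(f"drift after one second: {drift:.3f} samples")
```

Under this model the STO shifts the whole signal by a constant amount, whereas the SRO produces a misalignment that grows linearly with time (about 1.6 samples per second at 100 ppm and 16 kHz).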
Taxonomy
Topics: Speech and Audio Processing · Indoor and Outdoor Localization Technologies · Millimeter-Wave Propagation and Modeling
