SAMbA: Speech enhancement with Asynchronous ad-hoc Microphone Arrays

Nicolas Furnon (LORIA), Romain Serizel (MULTISPEECH, LORIA), Slim Essid (IDS, S2A, LTCI), Irina Illina (LORIA)

arXiv:2307.16582·eess.AS·August 1, 2023

Open Access

TL;DR

This paper introduces SAMbA, a deep neural network approach for speech enhancement in ad-hoc microphone arrays that effectively handles device asynchronization using an attention mechanism, avoiding costly resynchronization.

Contribution

The paper presents a novel DNN-based speech enhancement method that is robust to asynchronization in ad-hoc microphone arrays through an attention mechanism, eliminating the need for resynchronization.

Findings

01. Attention mechanism improves robustness to asynchronization

02. Asynchronization mainly affects DNN performance, not spatial filtering

03. Unsupervised estimation of asynchronization parameters

Abstract

Speech enhancement in ad-hoc microphone arrays is often hindered by the asynchronization of the devices composing the microphone array. Asynchronization comes from sampling time offset and sampling rate offset which inevitably occur when the microphones are embedded in different hardware components. In this paper, we propose a deep neural network (DNN)-based speech enhancement solution that is suited for applications in ad-hoc microphone arrays because it is distributed and copes with asynchronization. We show that asynchronization has a limited impact on the spatial filtering and mostly affects the performance of the DNNs. Instead of resynchronising the signals, which requires costly processing steps, we use an attention mechanism which makes the DNNs, thus our whole pipeline, robust to asynchronization. We also show that the attention mechanism leads to the asynchronization parameters…
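To make the two sources of asynchronization named in the abstract concrete, here is a minimal Python sketch of how a sampling time offset (STO) and a sampling rate offset (SRO) distort a recording relative to a reference device. It is an illustrative model only, not the authors' simulation setup: the function names `apply_offsets` and `drift_in_samples`, the parts-per-million parameterization, and the use of linear interpolation are all assumptions made for this example.

```python
import numpy as np


def apply_offsets(x, fs, sto_s=0.0, sro_ppm=0.0):
    """Simulate a device that starts sto_s seconds late and whose clock
    runs at fs * (1 + sro_ppm * 1e-6) instead of the nominal fs.

    Hypothetical helper for illustration; linear interpolation is a
    simplification of proper band-limited resampling.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    eps = sro_ppm * 1e-6
    # Sampling instants of the offset device, in reference-device samples:
    # the rate offset compresses or stretches the time axis, the time
    # offset shifts it.
    t = n / (1.0 + eps) + sto_s * fs
    # Instants that fall outside the reference recording are zero-filled.
    return np.interp(t, n, x, left=0.0, right=0.0)


def drift_in_samples(duration_s, fs, sro_ppm):
    """Number of samples by which the two devices diverge after
    duration_s seconds because of the rate offset alone."""
    return duration_s * fs * sro_ppm * 1e-6


if __name__ == "__main__":
    fs = 16000  # assumed sampling rate, not taken from the paper
    ref = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
    shifted = apply_offsets(ref, fs, sto_s=0.001, sro_ppm=100.0)
    print(shifted.shape, drift_in_samples(10.0, fs, 100.0))
```

With these assumed values, a 100 ppm rate offset at 16 kHz makes the two devices drift apart by roughly 16 samples over 10 seconds, which illustrates why the misalignment grows with recording length, whereas the STO only contributes a constant shift.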

Taxonomy

Topics: Speech and Audio Processing · Indoor and Outdoor Localization Technologies · Millimeter-Wave Propagation and Modeling