Attention-based distributed speech enhancement for unconstrained microphone arrays with varying number of nodes

Nicolas Furnon (MULTISPEECH); Romain Serizel (MULTISPEECH); Slim Essid (ADASP); Irina Illina (MULTISPEECH)

arXiv:2106.07939·eess.SP·June 16, 2021

TL;DR

This paper introduces an attention-based neural network approach for speech enhancement in ad-hoc microphone arrays that can adapt to varying numbers of devices and handle link failures effectively.

Contribution

The paper proposes a novel attention mechanism that dynamically weights the signals from different microphones, enabling robust speech enhancement in variable and unreliable array configurations.

Findings

01 Effective handling of varying microphone counts.

02 Robustness to link failures in microphone arrays.

03 Improved speech quality in ad-hoc array scenarios.

Abstract

Speech enhancement promises higher efficiency in ad-hoc microphone arrays than in constrained microphone arrays thanks to the wide spatial coverage of the devices in the acoustic scene. However, speech enhancement in ad-hoc microphone arrays still raises many challenges. In particular, the algorithms should be able to handle a variable number of microphones, as some devices in the array might appear or disappear. In this paper, we propose a solution that can efficiently process the spatial information captured by the different devices of the microphone array, while being robust to a link failure. To do this, we use an attention mechanism in order to put more weight on the relevant signals sent throughout the array and to neglect the redundant or empty channels.
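
The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes scaled dot-product attention over per-node feature vectors, with a mask that gives zero weight to nodes that are missing or whose link has failed. This is not the authors' architecture, and the model in the paper may differ substantially. The function name attend_over_channels and its parameters (query, channel_feats, available) are hypothetical and chosen only for illustration.

```python
import math


def attend_over_channels(query, channel_feats, available):
    """Fuse per-node feature vectors with masked softmax attention.

    query:         list[float], features of the local (reference) node.
    channel_feats: list[list[float]], one feature vector per node.
    available:     list[bool], False where a node is absent or its link failed.

    Returns (weights, fused). Hypothetical helper, not the paper's model.
    """
    if not any(available):
        raise ValueError("at least one channel must be available")
    dim = len(query)

    # Scaled dot-product score between the query and each channel.
    scores = [
        sum(q * k for q, k in zip(query, feats)) / math.sqrt(dim)
        for feats in channel_feats
    ]

    # Softmax restricted to available channels; masked channels get weight 0.
    max_score = max(s for s, ok in zip(scores, available) if ok)
    exps = [
        math.exp(s - max_score) if ok else 0.0
        for s, ok in zip(scores, available)
    ]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the channel features.
    fused = [
        sum(w * feats[i] for w, feats in zip(weights, channel_feats))
        for i in range(dim)
    ]
    return weights, fused


# Three nodes; the third one has dropped out of the array.
weights, fused = attend_over_channels(
    query=[1.0, 0.0],
    channel_feats=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
    available=[True, True, False],
)
```

Because the softmax is computed only over available channels, the same function accepts any number of nodes, and a failed link simply redistributes its weight to the remaining channels instead of injecting an empty signal into the fused output.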

Code & Models

Repositories

nfurnon/disco (PyTorch, official)

Taxonomy

Topics: Speech and Audio Processing · Millimeter-Wave Propagation and Modeling · Indoor and Outdoor Localization Technologies