TL;DR
This paper presents a distributed DNN-based speech enhancement method for ad-hoc microphone arrays, leveraging spatial information and compressed signals to improve noise reduction in realistic acoustic environments.
Contribution
It extends a previously introduced distributed DNN-based mask estimation scheme so that compressed signals are also used for noise estimation, improving speech enhancement in spatially unconstrained microphone arrays.
Findings
Nodes in the microphone array cooperate by exploiting their spatial coverage of the room.
Compressed signals improve both noise and target estimation.
The method performs well under realistic acoustic conditions.
Abstract
Deep neural network (DNN)-based speech enhancement algorithms in microphone arrays have now proven to be efficient solutions to speech understanding and speech recognition in noisy environments. However, in the context of ad-hoc microphone arrays, many challenges remain and raise the need for distributed processing. In this paper, we propose to extend a previously introduced distributed DNN-based time-frequency mask estimation scheme that can efficiently use spatial information in the form of so-called compressed signals, which are pre-filtered target estimations. We study the performance of this algorithm under realistic acoustic conditions and investigate practical aspects of its optimal application. We show that the nodes in the microphone array cooperate by taking advantage of their spatial coverage in the room. We also propose to use the compressed signals not only to convey the target…
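To make the idea of time-frequency masking concrete, the following is a minimal single-channel sketch and not the paper's distributed scheme. It assumes an oracle ratio mask computed from known target and noise spectrograms as a stand-in for the DNN output. The function names `ratio_mask` and `apply_mask` and the toy random spectrograms are hypothetical and chosen only for illustration.

```python
import numpy as np


def ratio_mask(target_mag, noise_mag, eps=1e-8):
    """Ratio-style mask in [0, 1] from magnitude spectrograms.

    Assumption: this oracle mask stands in for what a DNN would predict.
    """
    target_pow = target_mag ** 2
    noise_pow = noise_mag ** 2
    return target_pow / (target_pow + noise_pow + eps)


def apply_mask(mixture_stft, mask):
    """Scale each time-frequency bin of the mixture by the mask."""
    return mask * mixture_stft


# Toy complex "STFT" matrices (frequency bins x frames), not real audio.
rng = np.random.default_rng(0)
n_freq, n_frames = 4, 5
target = rng.standard_normal((n_freq, n_frames)) + 1j * rng.standard_normal((n_freq, n_frames))
noise = 0.5 * (rng.standard_normal((n_freq, n_frames)) + 1j * rng.standard_normal((n_freq, n_frames)))
mixture = target + noise

mask = ratio_mask(np.abs(target), np.abs(noise))
target_estimate = apply_mask(mixture, mask)        # masked target estimate
noise_estimate = apply_mask(mixture, 1.0 - mask)   # complementary noise estimate
```

In this sketch the complementary mask `1.0 - mask` yields a noise estimate alongside the target estimate, which loosely mirrors the idea of using one estimate to convey the target and another to convey the noise. How the paper actually computes and exchanges compressed signals between nodes is not specified in the text above.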
