End-to-end Two-dimensional Sound Source Localization With Ad-hoc Microphone Arrays
Yijun Gong, Shupei Liu, Xiao-Lei Zhang

TL;DR
This paper introduces a deep learning approach for 2D sound source localization using ad-hoc microphone arrays, capable of working with randomly distributed microphones in indoor environments.
Contribution
It presents an end-to-end spatial-temporal deep model with attention architecture that handles variable microphone configurations and localizes speakers in complex acoustic conditions.
Findings
High accuracy in reverberant environments
Effective with a single microphone per node
Handles variable microphone array configurations
Abstract
Conventional sound source localization methods are mostly based on a single microphone array that consists of multiple microphones, and they are usually formulated as a direction-of-arrival estimation problem. In this paper, we propose a deep-learning-based end-to-end sound source localization method with ad-hoc microphone arrays, where an ad-hoc microphone array is a set of randomly distributed microphone arrays that collaborate with each other. The method can produce two-dimensional locations of speakers with only a single microphone per node. Specifically, we divide a targeted indoor space into multiple local areas and encode each local area with a one-hot code, so that the node and speaker locations can be represented by the one-hot codes. Accordingly, the sound source localization problem is formulated as a classification task of recognizing the one-hot code of the speaker given…
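The grid-based encoding described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the uniform rectangular grid, the room dimensions, and the helper names `cell_index`, `one_hot`, and `cell_center` are all assumptions made for illustration, since the excerpt does not say how the local areas are laid out.

```python
from typing import List, Tuple


def cell_index(x: float, y: float, room_w: float, room_h: float,
               nx: int, ny: int) -> int:
    """Map a 2D position to the index of the grid cell (local area) containing it.

    Assumes a uniform nx-by-ny grid over a room_w-by-room_h rectangle
    (a hypothetical layout, not taken from the paper).
    """
    if not (0 <= x <= room_w and 0 <= y <= room_h):
        raise ValueError("position lies outside the room")
    # Clamp so that points exactly on the far wall fall into the last cell.
    col = min(int(x / room_w * nx), nx - 1)
    row = min(int(y / room_h * ny), ny - 1)
    return row * nx + col


def one_hot(index: int, size: int) -> List[int]:
    """Return a one-hot code of length `size` with a 1 at `index`."""
    code = [0] * size
    code[index] = 1
    return code


def cell_center(index: int, room_w: float, room_h: float,
                nx: int, ny: int) -> Tuple[float, float]:
    """Decode a cell index back to the 2D coordinates of the cell center."""
    row, col = divmod(index, nx)
    return ((col + 0.5) * room_w / nx, (row + 0.5) * room_h / ny)


# Example: a 4 m x 3 m room split into 4 x 3 = 12 local areas.
speaker_cell = cell_index(2.5, 1.2, room_w=4.0, room_h=3.0, nx=4, ny=3)
speaker_code = one_hot(speaker_cell, 4 * 3)
estimate = cell_center(speaker_cell, room_w=4.0, room_h=3.0, nx=4, ny=3)
```

Under this assumed layout, a classifier that predicts the speaker's one-hot code yields a 2D location estimate (the cell center) whose resolution is bounded by the cell size, so a finer grid trades a harder classification problem for a smaller quantization error.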
Taxonomy
Topics: Speech and Audio Processing · Indoor and Outdoor Localization Technologies · Underwater Acoustics Research
