Learning Multiple Sound Source 2D Localization

Guillaume Le Moing; Phongtharin Vinayavekhin; Tadanobu Inoue; Jayakorn Vongkulbhisal; Asim Munawar; Ryuki Tachibana; Don Joven Agravante

arXiv:2012.05515·eess.AS·December 11, 2020

TL;DR

This paper introduces deep learning algorithms for accurately localizing multiple sound sources in 2D space using microphone arrays, with novel representations and metrics validated on synthetic and real data.

Contribution

It presents new deep learning architectures, localization representations, and evaluation metrics for multi-source sound localization in enclosed environments.

Findings

01

Improved localization accuracy over baseline methods

02

Effective on both synthetic and real-world data

03

New metrics enable better comparison of approaches

Abstract

In this paper, we propose novel deep-learning-based algorithms for multiple sound source localization. Specifically, we aim to find the 2D Cartesian coordinates of multiple sound sources in an enclosed environment using multiple microphone arrays. To this end, we use an encoding-decoding architecture and propose two improvements to it. We also propose two novel localization representations that increase accuracy. Lastly, we develop new metrics based on resolution-based multiple source association, which enable us to evaluate and compare different localization approaches. We tested our method on both synthetic and real-world data. The results show that our method improves upon the previous baseline approach for this problem.
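To make the idea of resolution-based multiple source association more concrete, here is a minimal sketch, not the paper's actual metric. It assumes that a predicted 2D position counts as a correct detection when it lies within a chosen distance (the `resolution`) of a true source that has not already been matched. The greedy nearest-pair matching and the function names `associate` and `precision_recall` are illustrative assumptions; the paper may use a different association rule and different scores.

```python
import math

def associate(preds, truths, resolution):
    """Greedily pair predicted and true 2D positions closer than `resolution`.

    Hypothetical helper: the association rule here is an assumption,
    not the one defined in the paper.
    """
    # Collect every (prediction, truth) pair within the resolution,
    # sorted so that the closest pairs are considered first.
    candidates = sorted(
        (math.dist(p, t), i, j)
        for i, p in enumerate(preds)
        for j, t in enumerate(truths)
        if math.dist(p, t) <= resolution
    )
    used_preds, used_truths, matches = set(), set(), []
    for d, i, j in candidates:
        # Each prediction and each true source may be matched at most once.
        if i not in used_preds and j not in used_truths:
            used_preds.add(i)
            used_truths.add(j)
            matches.append((i, j, d))
    return matches

def precision_recall(preds, truths, resolution):
    """Precision and recall of detections under the association above."""
    true_positives = len(associate(preds, truths, resolution))
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall

# Two true sources, three predictions (coordinates in metres, made up).
truths = [(1.2, 1.0), (4.0, 4.5)]
preds = [(1.0, 1.0), (4.0, 4.0), (9.0, 9.0)]
precision, recall = precision_recall(preds, truths, resolution=0.6)
print(precision, recall)
```

With a coarser resolution more predictions count as hits, so sweeping `resolution` gives one way to compare approaches at different levels of spatial tolerance.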

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Hearing Loss and Rehabilitation