SRP-DNN: Learning Direct-Path Phase Difference for Multiple Moving Sound Source Localization

Bing Yang; Hong Liu; Xiaofei Li

arXiv:2202.07859 · cs.SD · February 17, 2022

Open Access

TL;DR

This paper introduces SRP-DNN, a deep learning approach that learns direct-path phase differences for localizing multiple moving sound sources, improving accuracy in noisy and reverberant environments.

Contribution

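The paper proposes a causal convolutional recurrent neural network that learns the direct-path phase difference sequence for each microphone pair. Its learning target is a weighted sum over sources, which avoids the assignment ambiguity and uncertain output dimension that arise when several sources are predicted at once.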

Findings

01. Outperforms existing methods in noisy environments

02. Effective in real-world reverberant scenarios

03. Accurately localizes multiple moving sources

Abstract

Multiple moving sound source localization in real-world scenarios remains a challenging issue due to interaction between sources, time-varying trajectories, distorted spatial cues, etc. In this work, we propose to use deep learning techniques to learn competing and time-varying direct-path phase differences for localizing multiple moving sound sources. A causal convolutional recurrent neural network is designed to extract the direct-path phase difference sequence from signals of each microphone pair. To avoid the assignment ambiguity and the problem of uncertain output-dimension encountered when simultaneously predicting multiple targets, the learning target is designed in a weighted sum format, which encodes source activity in the weight and direct-path phase differences in the summed value. The learned direct-path phase differences for all microphone pairs can be directly used to…
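The weighted-sum target described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the far-field two-microphone model, the speed of sound, and every function name below are assumptions made for illustration. The final step, which scores candidate directions against the target, is likewise an assumed steered-response-power style readout suggested by the method's name, not a detail taken from the truncated abstract.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s (assumed value)


def tdoa_far_field(doa_deg, mic_distance):
    """Time difference of arrival (s) for one microphone pair.

    Assumes a far-field source; the angle is measured from the pair's axis.
    """
    return mic_distance * np.cos(np.deg2rad(doa_deg)) / SPEED_OF_SOUND


def dp_ipd(doa_deg, freqs, mic_distance):
    """Unit-modulus direct-path phase difference at each frequency (hypothetical helper)."""
    tau = tdoa_far_field(doa_deg, mic_distance)
    return np.exp(1j * 2.0 * np.pi * freqs * tau)


def weighted_sum_target(doas_deg, activities, freqs, mic_distance):
    """Sum of per-source phase differences, each weighted by its source activity.

    The output size depends only on the number of frequencies, not on the
    number of sources, and the sum does not depend on source ordering.
    """
    target = np.zeros(len(freqs), dtype=complex)
    for doa, activity in zip(doas_deg, activities):
        target += activity * dp_ipd(doa, freqs, mic_distance)
    return target


def spatial_spectrum(target, candidates_deg, freqs, mic_distance):
    """Score each candidate direction by correlating its phase pattern with the target."""
    return np.array([
        np.real(np.vdot(dp_ipd(c, freqs, mic_distance), target)) / len(freqs)
        for c in candidates_deg
    ])


if __name__ == "__main__":
    freqs = np.linspace(100.0, 4000.0, 128)  # Hz, illustrative frequency grid
    candidates = np.arange(0, 181)           # candidate directions in degrees
    target = weighted_sum_target([60.0], [1.0], freqs, 0.08)
    spectrum = spatial_spectrum(target, candidates, freqs, 0.08)
    print("estimated direction (deg):", candidates[np.argmax(spectrum)])
```

Because an inactive source enters the sum with weight zero, the same fixed-size target covers zero, one, or several sources, which is how a summed format sidesteps both the ordering of outputs and the unknown source count.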

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Acoustic Wave Phenomena Research