The Neural-SRP method for positional sound source localization

Eric Grinstein, Toon van Waterschoot, Mike Brookes, Patrick A. Naylor

arXiv:2403.09455·cs.SD·March 15, 2024·1 cites

TL;DR

Neural-SRP is a deep learning-based method that enhances sound source localization in reverberant environments, offering flexible microphone configurations and outperforming traditional SRP methods.

Contribution

The paper introduces Neural-SRP, a neural network that combines SRP's flexibility with the accuracy of DNNs. It is trained on simulated data with transfer learning and supports ad-hoc microphone setups.

Findings

01 Neural-SRP significantly outperforms baseline methods.

02 The approach is effective on both recorded and simulated data.

03 It adapts well to various microphone topologies.

Abstract

Steered Response Power (SRP) is a widely used method for the task of sound source localization using microphone arrays, showing satisfactory localization performance in many practical scenarios. However, its performance is diminished in highly reverberant environments. Although Deep Neural Networks (DNNs) have been previously proposed to overcome this limitation, most are trained for a specific number of microphones with fixed spatial coordinates. This restricts their practical application in scenarios frequently observed in wireless acoustic sensor networks, where each application has an ad-hoc microphone topology. We propose Neural-SRP, a DNN which combines the flexibility of SRP with the performance gains of DNNs. We train our network using simulated data and transfer learning, and evaluate our approach on recorded and simulated data. Results verify that Neural-SRP's localization…
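
For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of conventional SRP with PHAT weighting. It is not the Neural-SRP network and is not taken from the authors' repository. The function names, the 2D microphone layout, the candidate grid, the sampling rate and the idealized echo-free simulation are all assumptions made for illustration. It also shows why SRP accepts any microphone topology: the array geometry enters only through the distances between candidate points and microphones.

```python
import itertools
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def gcc_phat(x_i, x_j):
    """Circular GCC-PHAT of two equal-length signals; index n//2 is zero lag."""
    n = len(x_i)
    cross = np.fft.rfft(x_i) * np.conj(np.fft.rfft(x_j))
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep only the phase
    return np.fft.fftshift(np.fft.irfft(cross, n=n))


def srp_phat_map(signals, mic_positions, candidates, fs):
    """For each candidate, sum every pair's GCC-PHAT value at the expected lag."""
    n = signals.shape[1]
    # Distance from every candidate point to every microphone, shape (P, M)
    dists = np.linalg.norm(
        candidates[:, None, :] - mic_positions[None, :, :], axis=-1
    )
    power = np.zeros(len(candidates))
    for i, j in itertools.combinations(range(len(mic_positions)), 2):
        cc = gcc_phat(signals[i], signals[j])
        # Expected time difference of arrival, rounded to whole samples
        lags = np.rint((dists[:, i] - dists[:, j]) * fs / SPEED_OF_SOUND)
        power += cc[n // 2 + lags.astype(int)]
    return power


# Hypothetical echo-free setup: 4 microphones in a 5 m x 4 m area
fs, n = 16000, 4096
mics = np.array([[0.5, 0.5], [4.5, 0.5], [4.5, 3.5], [0.5, 3.5]])
true_source = np.array([3.0, 2.0])

# Simulate propagation by delaying white noise in the frequency domain
rng = np.random.default_rng(0)
source = rng.standard_normal(n)
delays = np.linalg.norm(mics - true_source, axis=1) / SPEED_OF_SOUND
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
phase = np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
signals = np.fft.irfft(np.fft.rfft(source)[None, :] * phase, n=n)

# Candidate grid with 0.25 m spacing
gx, gy = np.meshgrid(np.arange(0, 5.01, 0.25), np.arange(0, 4.01, 0.25))
candidates = np.column_stack([gx.ravel(), gy.ravel()])

power = srp_phat_map(signals, mics, candidates, fs)
estimate = candidates[np.argmax(power)]
print("estimated source position:", estimate)
```

In this idealized simulation there is no reverberation, so each pairwise correlation has a single clean peak. In a reverberant room, reflections add spurious peaks to the correlations, which is the limitation the abstract describes.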

Code & Models

Repositories

egrinstein/gnn_ssl
PyTorch · Official

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Flow Measurement and Analysis