Leveraging Geometrical Acoustic Simulations of Spatial Room Impulse Responses for Improved Sound Event Detection and Localization
Christopher Ick, Brian McFee

TL;DR
This paper explores using geometrical acoustic simulations to generate synthetic spatial audio data, enhancing sound event detection and localization models without extensive real-world recordings.
Contribution
It introduces a novel SRIR dataset generated via geometrical acoustics, demonstrating its effectiveness for training and augmenting SELD models.
Findings
Simulated SRIR data achieves comparable performance to real data.
Augmenting datasets with simulated data improves model benchmarks.
Geometrical acoustics can effectively generate synthetic spatial audio for SELD.
Abstract
As deeper and more complex models are developed for the task of sound event localization and detection (SELD), the demand for annotated spatial audio data continues to increase. Annotating field recordings with 360° video takes many hours from trained annotators, while recording events within motion-tracked laboratories is bounded by cost and expertise. Because of this, localization models rely on a relatively limited amount of spatial audio data in the form of spatial room impulse response (SRIR) datasets, which limits the progress of increasingly deep neural-network-based approaches. In this work, we demonstrate that simulated geometrical acoustics can provide an appealing solution to this problem. We use simulated geometrical acoustics to generate a novel SRIR dataset that can train a SELD model to provide similar performance to that of a real SRIR dataset. Furthermore, we…
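The abstract does not spell out how an SRIR dataset becomes training audio. A common approach, assumed here rather than taken from the paper, is to convolve a mono sound event with each channel of an SRIR so that the result carries the room's spatial cues. The sketch below illustrates that step with a hand-built toy SRIR (one direct path plus one reflection); the function names `convolve` and `spatialize`, the channel count, and the gain values are all hypothetical.

```python
def convolve(signal, impulse_response):
    """Full discrete convolution of two 1-D sequences (pure Python)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(mono_event, srir):
    """Convolve a mono event with every channel of an SRIR.

    srir is a list of per-channel impulse responses of equal length.
    Returns one output channel per SRIR channel.
    """
    return [convolve(mono_event, channel_ir) for channel_ir in srir]


# Toy 4-channel SRIR (illustrative values, not from any real dataset):
# a direct path at sample 1 and a single reflection at sample 4.
ir_length = 6
direct_gains = [1.0, 0.5, 0.0, 0.5]
reflection_gains = [0.3, -0.1, 0.0, 0.2]
toy_srir = []
for g_direct, g_reflect in zip(direct_gains, reflection_gains):
    ir = [0.0] * ir_length
    ir[1] = g_direct
    ir[4] = g_reflect
    toy_srir.append(ir)

mono_event = [1.0, -0.5, 0.25]  # stand-in for a short sound event
spatial_event = spatialize(mono_event, toy_srir)
print(len(spatial_event), len(spatial_event[0]))
```

In practice the impulse responses would come from a measured or simulated SRIR dataset and the convolution would use an FFT-based routine; the nested loop is only for readability.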
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
