Sound Event Localization and Detection Using CRNN on Pairs of Microphones
Francois Grondin, James Glass, Iwona Sobieraj, Mark D. Plumbley

TL;DR
This paper introduces a CRNN-based system that detects and localizes sound events from pairs of microphones, and reports that it outperforms the DCASE 2019 baseline system on a four-microphone array.
Contribution
The paper presents a system built on two CRNNs that perform sound event detection (SED) and time difference of arrival (TDOA) estimation on each pair of microphones in an array, and it outperforms the DCASE 2019 baseline.
Findings
Outperforms DCASE 2019 baseline system
Provides a 3-D DOA estimate from per-pair TDOA estimates
Combines results from six microphone pairs for final classification
Abstract
This paper proposes sound event localization and detection methods from multichannel recording. The proposed system is based on two Convolutional Recurrent Neural Networks (CRNNs) to perform sound event detection (SED) and time difference of arrival (TDOA) estimation on each pair of microphones in a microphone array. In this paper, the system is evaluated with a four-microphone array, and thus combines the results from six pairs of microphones to provide a final classification and a 3-D direction of arrival (DOA) estimate. Results demonstrate that the proposed approach outperforms the DCASE 2019 baseline system.
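The step from per-pair TDOAs to a single 3-D DOA can be illustrated with a small geometric sketch. The abstract does not say how the paper combines the six pairs, so the following is only one common possibility and not the authors' method: a far-field (plane-wave) least-squares fit. The function names (`mic_pairs`, `simulate_tdoas`, `doa_from_tdoas`), the speed of sound, and the tetrahedral array geometry are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def mic_pairs(num_mics):
    """Return all unordered microphone index pairs (i, j) with i < j."""
    return list(combinations(range(num_mics), 2))


def simulate_tdoas(mic_positions, direction, c=SPEED_OF_SOUND):
    """Far-field TDOAs tau_ij = t_i - t_j for a unit vector pointing at the source."""
    # A plane wave reaches microphone i at t_i = -(p_i . u) / c, up to a constant.
    arrival = -(mic_positions @ direction) / c
    pairs = mic_pairs(len(mic_positions))
    return np.array([arrival[i] - arrival[j] for i, j in pairs])


def doa_from_tdoas(mic_positions, tdoas, c=SPEED_OF_SOUND):
    """Least-squares unit DOA vector from one TDOA per microphone pair."""
    pairs = mic_pairs(len(mic_positions))
    baselines = np.array([mic_positions[i] - mic_positions[j] for i, j in pairs])
    # Model: baselines @ u = -c * tdoas (six equations, three unknowns for 4 mics).
    u, *_ = np.linalg.lstsq(baselines, -c * np.asarray(tdoas), rcond=None)
    return u / np.linalg.norm(u)


# Hypothetical tetrahedral array at a 4 cm scale (not the geometry from the paper).
mics = 0.04 * np.array(
    [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float
)
true_direction = np.array([1.0, 2.0, 2.0]) / 3.0  # unit vector toward the source
estimate = doa_from_tdoas(mics, simulate_tdoas(mics, true_direction))
```

With four microphones, `mic_pairs(4)` yields the six pairs mentioned in the abstract. In this noise-free simulation the least-squares fit recovers `true_direction`; with TDOAs estimated by a network, the overdetermined system would instead average out some of the per-pair error.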
