TL;DR
This paper reviews the first international challenge on sound event localization and detection, analyzing system performance, the evaluation metrics, and how results change when detection and localization are evaluated jointly.
Contribution
It provides a comprehensive overview of the challenge, introduces new joint evaluation metrics, and reevaluates submissions to highlight the importance of integrated performance.
Findings
Joint metrics reveal which systems perform best when detection and localization are scored together
Some top-ranked systems excelled in individual tasks but not jointly
Reevaluation changed the ranking of systems based on combined performance
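To illustrate why joint scoring can change a ranking, here is a minimal sketch of a location-aware F1 score. It is not the challenge's official metric implementation: the function names, the greedy matching, the single-frame setup, and the 20 degree threshold are all assumptions made for this example. A prediction counts as a true positive only if its class label matches a reference event and, when a threshold is given, its direction is within that angular distance of the reference.

```python
import math

def angular_error_deg(az1, el1, az2, el2):
    """Great-circle angle (degrees) between two directions given as
    azimuth/elevation pairs in degrees."""
    a1, e1, a2, e2 = map(math.radians, (az1, el1, az2, el2))
    cos_angle = (math.sin(e1) * math.sin(e2)
                 + math.cos(e1) * math.cos(e2) * math.cos(a1 - a2))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def f1_score(references, predictions, max_error_deg=None):
    """F1 over (label, azimuth, elevation) tuples for a single frame.

    max_error_deg=None  -> detection only: the class label must match.
    max_error_deg=value -> joint: the label must match AND the predicted
                           direction must lie within the threshold.
    Matching is greedy (an assumption of this sketch), one reference per prediction.
    """
    unmatched = list(references)
    tp = 0
    for label, az, el in predictions:
        for ref in unmatched:
            r_label, r_az, r_el = ref
            close = (max_error_deg is None
                     or angular_error_deg(az, el, r_az, r_el) <= max_error_deg)
            if label == r_label and close:
                tp += 1
                unmatched.remove(ref)
                break  # safe: we stop iterating right after mutating the list
    fp = len(predictions) - tp
    fn = len(references) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical frame: both classes are detected, but "door" is localized badly.
references = [("speech", 30, 0), ("door", -90, 10)]
predictions = [("speech", 35, 0), ("door", 60, 10)]

detection_only = f1_score(references, predictions)           # labels only
joint = f1_score(references, predictions, max_error_deg=20)  # labels + location
print(detection_only, joint)
```

In this made-up frame the system looks perfect when only class labels are compared, but the joint score drops because one correctly labeled event is placed far from its true direction. A system ranked highly on detection alone can fall in a joint ranking for the same reason.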
Abstract
Sound event localization and detection is a novel area of research that emerged from the combined interest of analyzing the acoustic scene in terms of the spatial and temporal activity of sounds of interest. This paper presents an overview of the first international evaluation on sound event localization and detection, organized as a task of the DCASE 2019 Challenge. A large-scale realistic dataset of spatialized sound events was generated for the challenge, to be used for training of learning-based approaches, and for evaluation of the submissions in an unlabeled subset. The overview presents in detail how the systems were evaluated and ranked and the characteristics of the best-performing systems. Common strategies in terms of input features, model architectures, training approaches, exploitation of prior knowledge, and data augmentation are discussed. Since ranking in the challenge…
