DCASE 2024 Task 4: Sound Event Detection with Heterogeneous Data and Missing Labels
Samuele Cornell, Janek Ebbers, Constance Douwes, Irene Martín-Morató, Manu Harju, Annamaria Mesaros, Romain Serizel

TL;DR
This paper discusses the DCASE 2024 Task 4 challenge focused on developing sound event detection systems that can handle heterogeneous data with varying annotation quality and missing labels, aiming for robust performance across diverse environments.
Contribution
It introduces a new challenge setup and an updated baseline system to address training with diverse, incomplete, and inconsistent annotations in sound event detection.
Findings
Using diverse domain data improves SED performance over single-domain training.
The baseline system demonstrates robustness despite missing labels and annotation inconsistencies.
Research indicates potential for more generalized SED systems with heterogeneous data.
Abstract
The Detection and Classification of Acoustic Scenes and Events Challenge Task 4 aims to advance sound event detection (SED) systems in domestic environments by leveraging training data with different supervision uncertainty. Participants are challenged to explore how best to use training data from different domains and with varying annotation granularity (strong/weak temporal resolution, soft/hard labels) to obtain a robust SED system that can generalize across different scenarios. Crucially, annotation across the available training datasets can be inconsistent: a sound class may be present in the audio of one dataset but not annotated there, even though it is annotated in the other, and vice versa. As such, systems will have to cope with potentially missing target labels during training. Moreover, as an additional novelty, systems will also be evaluated on labels with different granularity in order to assess their…
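One common way to cope with missing target labels is to exclude unannotated classes from the training loss. The following is a minimal sketch of that idea, not necessarily what the task baseline does: the function name `masked_bce`, the `class_mask` argument, and the toy values are all hypothetical and chosen for illustration only.

```python
import numpy as np


def masked_bce(probs, targets, class_mask, eps=1e-7):
    """Binary cross-entropy averaged only over annotated classes.

    probs, targets, class_mask: arrays of shape (batch, n_classes).
    class_mask is 1 where the class is annotated in the clip's source
    dataset and 0 where the label is unknown (assumed convention).
    """
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    bce = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    mask = class_mask.astype(bce.dtype)
    n_annotated = mask.sum()
    if n_annotated == 0:
        return 0.0  # nothing annotated in this batch
    return float((bce * mask).sum() / n_annotated)


# Toy example: three classes, the third is not annotated for this clip.
probs = np.array([[0.9, 0.2, 0.5]])
targets = np.array([[1.0, 0.0, 0.0]])
class_mask = np.array([[1, 1, 0]])
loss = masked_bce(probs, targets, class_mask)
```

Because the third class is masked out, its target value has no effect on the loss, so a class that is merely unannotated is not treated as a confirmed negative.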
Taxonomy
Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing
