LEAD Dataset: How Can Labels for Sound Event Detection Vary Depending on Annotators?

Naoki Koga, Yoshiaki Bando, and Keisuke Imoto

arXiv:2410.09778 · cs.SD · October 15, 2024

TL;DR

The LEAD dataset provides multi-annotator sound event labels to analyze how annotation variability affects sound event detection models and to develop more robust approaches.

Contribution

We introduce the LEAD dataset with annotations from 20 different annotators, enabling analysis of label variation and robustness in sound event detection.

Findings

01 Significant variation exists among annotators' labels.

02 The analysis shows how label differences affect model training.

03 The results offer insights into building SED models that are robust to annotation variability.

Abstract

In this paper, we introduce the LargE-scale Annotator's labels for sound event Detection (LEAD) dataset, which is designed to give a better understanding of the variation in strong labels in sound event detection (SED). Collecting large-scale strong labels for SED is very time-consuming, so in most cases multiple workers divide up the annotation work to create a single dataset. In general, strong labels created by multiple annotators vary widely in both the type of sound event and the temporal onset/offset. When multiple workers annotate, uniquely determining the strong label is quite difficult because the dataset contains sounds that can be mistaken for similar classes and sounds whose temporal onset/offset is difficult to distinguish. If the strong labels of SED vary greatly depending on the annotator, the SED model trained on a dataset created by multiple…
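To make the two kinds of variation described above concrete (disagreement on the event class and on the onset/offset), here is a minimal sketch of one way to compare two annotators' strong labels. It is not the paper's analysis method. It assumes each strong label is a hypothetical `(event_class, onset_sec, offset_sec)` tuple, discretizes time into frames with an assumed hop of 0.1 s, and reports a per-class frame-level Jaccard index. The function names, class names, and timestamps are all invented for illustration.

```python
def active_frames(events, hop=0.1):
    """Map each event class to the set of frame indices where it is active.

    `events` is a list of (event_class, onset_sec, offset_sec) tuples.
    This tuple layout and the 0.1 s hop are assumptions for illustration.
    """
    frames = {}
    for label, onset, offset in events:
        start = int(round(onset / hop))
        end = int(round(offset / hop))
        frames.setdefault(label, set()).update(range(start, end))
    return frames


def frame_jaccard(events_a, events_b, hop=0.1):
    """Per-class Jaccard index between two annotators' frame-level activity."""
    frames_a = active_frames(events_a, hop)
    frames_b = active_frames(events_b, hop)
    scores = {}
    for label in sorted(set(frames_a) | set(frames_b)):
        a = frames_a.get(label, set())
        b = frames_b.get(label, set())
        union = a | b
        # A class that is never active for either annotator counts as agreement.
        scores[label] = len(a & b) / len(union) if union else 1.0
    return scores


# Hypothetical annotations of the same clip by two annotators.
annotator_a = [("dog_bark", 1.0, 2.0), ("speech", 3.0, 5.0)]
annotator_b = [("dog_bark", 1.2, 2.0), ("footsteps", 3.0, 5.0)]

scores = frame_jaccard(annotator_a, annotator_b)
for label, score in scores.items():
    print(f"{label}: {score:.2f}")
```

In this toy example the two annotators agree on the class `dog_bark` but differ on its onset, which lowers the score to 0.8, while the same time span labeled `speech` by one and `footsteps` by the other scores 0.0 for both classes. A real comparison would also need to choose how to aggregate over clips and annotator pairs.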

Code & Models

Repositories

KeisukeImoto/LEAD_dataset (Official)

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech Recognition and Synthesis