Online Active Learning For Sound Event Detection

Mark Lindsey; Ankit Shah; Francis Kubala; Richard M. Stern

arXiv:2309.14460 · eess.AS · September 29, 2023

Open Access

TL;DR

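This paper introduces new loss functions for Online Active Learning (OAL) in Sound Event Detection (SED) that address fluctuating class distributions and data drift. On the SONYC dataset, OAL cuts the time and effort needed to train SED classifiers by a factor of 5; the methods are also evaluated on two Voice-Type Discrimination (VTD) corpora.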

Contribution

It proposes new loss functions tailored to OAL for SED that address fluctuating class distributions and data drift, two problems that persist in existing OAL methods.

Findings

01

OAL reduces training effort by a factor of 5 on SONYC.

02

New loss functions effectively handle data drift in OAL.

03

On SONYC and two VTD corpora, the new methods resolve issues present in existing OAL methods.

Abstract

Data collection and annotation is a laborious, time-consuming prerequisite for supervised machine learning tasks. Online Active Learning (OAL) is a paradigm that addresses this issue by simultaneously minimizing the amount of annotation required to train a classifier and adapting to changes in the data over the duration of the data collection process. Prior work has indicated that fluctuating class distributions and data drift are still common problems for OAL. This work presents new loss functions that address these challenges when OAL is applied to Sound Event Detection (SED). Experimental results from the SONYC dataset and two Voice-Type Discrimination (VTD) corpora indicate that OAL can reduce the time and effort required to train SED classifiers by a factor of 5 for SONYC, and that the new methods presented here successfully resolve issues present in existing OAL methods.
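The OAL setting described in the abstract can be illustrated with a minimal sketch. This is not the paper's method and does not implement its loss functions: it assumes a generic uncertainty-margin query rule, a tiny logistic-regression model, and a synthetic one-feature stream with an abrupt drift halfway through. All names (`OnlineLogisticModel`, `run_oal`, `make_stream`, `margin`) are hypothetical and chosen only for illustration.

```python
import math
import random


class OnlineLogisticModel:
    """Tiny logistic-regression classifier updated one example at a time (SGD)."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp so exp() cannot overflow
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # One gradient step on the standard log loss (an assumption, not the paper's loss).
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def run_oal(stream, model, margin=0.2):
    """Single pass over the stream; ask for a label only when the model is uncertain."""
    queried = 0
    for x, oracle_label in stream:
        p = model.predict_proba(x)
        if abs(p - 0.5) < margin:          # uncertain: request an annotation
            model.update(x, oracle_label)  # the "annotator" here is the stored label
            queried += 1
    return queried


def make_stream(n, seed=0):
    """Synthetic binary stream; feature means shift by +1 in the second half (drift)."""
    rng = random.Random(seed)
    stream = []
    for i in range(n):
        y = rng.randint(0, 1)
        shift = 0.0 if i < n // 2 else 1.0
        x = [rng.gauss(2.0 * y - 1.0 + shift, 0.5)]
        stream.append((x, y))
    return stream


if __name__ == "__main__":
    stream = make_stream(400)
    model = OnlineLogisticModel(n_features=1)
    queried = run_oal(stream, model)
    print(f"labels requested: {queried} of {len(stream)}")
```

In this sketch the model starts fully uncertain, so early items are always sent for annotation. As confidence grows, fewer labels are requested, and the drift in the second half pushes more items back inside the uncertainty margin. That is the general behavior OAL relies on, under the assumptions stated above.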

Taxonomy

Topics: Water Systems and Optimization · Music and Audio Processing · Data Stream Mining Techniques