Towards duration robust weakly supervised sound event detection

Heinrich Dinkel; Mengyue Wu; Kai Yu

arXiv:2101.07687·cs.SD·February 8, 2021

TL;DR

This paper introduces a duration-robust CRNN framework for weakly supervised sound event detection (WSSED) that performs well without prior knowledge of event durations, particularly on datasets with short events, using a new post-processing strategy and data augmentation.

Contribution

The paper proposes a CRNN-based model, a Triple Threshold post-processing strategy, and data augmentation methods that together improve localization performance in weakly supervised SED without requiring duration labels.
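The Triple Threshold idea can be illustrated with a small sketch. This summary does not spell out the procedure, so the code below assumes it combines one clip-level threshold with double (hysteresis) thresholding on the frame-level probabilities of a single event class. The function names and the default threshold values (`clip_thr`, `low`, `high`) are illustrative assumptions, not taken from the official repository.

```python
from typing import List, Sequence, Tuple


def double_threshold(
    probs: Sequence[float], low: float, high: float
) -> List[Tuple[int, int]]:
    """Hysteresis thresholding over per-frame probabilities.

    Returns [start, end) frame intervals. A contiguous run of frames with
    probability above `low` is kept only if at least one frame in the run
    also exceeds `high`.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    has_peak = False
    # The -inf sentinel closes a run that reaches the last frame.
    for i, p in enumerate(list(probs) + [float("-inf")]):
        if p > low:
            if start is None:
                start, has_peak = i, False
            has_peak = has_peak or p > high
        elif start is not None:
            if has_peak:
                segments.append((start, i))
            start = None
    return segments


def triple_threshold(
    clip_prob: float,
    frame_probs: Sequence[float],
    clip_thr: float = 0.5,   # assumed value, for illustration only
    low: float = 0.2,        # assumed value, for illustration only
    high: float = 0.75,      # assumed value, for illustration only
) -> List[Tuple[int, int]]:
    """Assumed triple threshold: clip-level gate, then double threshold."""
    if clip_prob <= clip_thr:
        # The event is judged absent from the clip, so no frames are reported.
        return []
    return double_threshold(frame_probs, low, high)


if __name__ == "__main__":
    frames = [0.1, 0.3, 0.8, 0.6, 0.25, 0.1, 0.3, 0.4, 0.1]
    print(triple_threshold(0.9, frames))  # [(1, 5)]
    print(triple_threshold(0.3, frames))  # []
```

In this sketch the second run of frames (indices 6 to 7) is discarded because it never rises above `high`, and the clip-level gate suppresses all frame-level output when the clip-level probability is low. No thresholds here depend on event duration, which is the property the summary emphasizes.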

Findings

01 Outperforms existing methods on DCASE2018 and URBAN-SED datasets.

02 Achieves similar performance to supervised models on URBAN-SED.

03 Post-processing significantly reduces localization performance drop.

Abstract

Sound event detection (SED) is the task of tagging the absence or presence of audio events and their corresponding interval within a given audio clip. While SED can be done using supervised machine learning, where training data is fully labeled with access to per-event timestamps and durations, our work focuses on weakly-supervised sound event detection (WSSED), where prior knowledge about an event's duration is unavailable. Recent research within the field focuses on improving segment- and event-level localization performance for specific datasets regarding specific evaluation metrics. Specifically, well-performing event-level localization requires fully labeled development subsets to obtain event duration estimates, which significantly benefits localization performance. Moreover, well-performing segment-level localization models output predictions at a coarse scale (e.g., 1 second),…

Code & Models

Repositories

RicherMans/CDur (PyTorch, official)
