Sound event detection using weakly-labeled semi-supervised data with GCRNNS, VAT and Self-Adaptive Label Refinement

Robert Harb; Franz Pernkopf

arXiv:1810.06897·cs.SD·October 17, 2018

TL;DR

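This paper presents a semi-supervised sound event detection method that combines gated convolutional recurrent neural networks (GCRNNs), virtual adversarial training (VAT), and self-adaptive label refinement. Trained on weakly labeled data, it reaches a macro-averaged event-based F-score of 34.6% on DCASE 2018 task 4, a 20.5% relative improvement over the baseline system.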

Contribution

The paper proposes a new approach combining gated convolutional recurrent neural networks, virtual adversarial training, and self-adaptive label refinement for weakly labeled sound event detection.
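As a rough illustration of the virtual adversarial training idea named above, the sketch below computes a VAT loss for a toy linear softmax classifier. It is a minimal sketch under stated assumptions, not the authors' implementation: the linear model, the function names (`vat_loss`, `softmax`, `kl_divergence`), the parameters `epsilon` and `xi`, and all array sizes are hypothetical. Because the target distribution is the model's own prediction, no label is needed, which is why the same loss can be applied to unlabeled clips.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D vector of logits.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kl_divergence(p, q):
    # KL(p || q) for two strictly positive probability vectors.
    return float(np.sum(p * (np.log(p) - np.log(q))))

def vat_loss(x, weights, epsilon=1.0, xi=1e-6, seed=0):
    """VAT loss for a toy linear softmax model (assumption: logits = weights @ x).

    x:       (features,) input vector
    weights: (classes, features) model parameters
    Returns the KL divergence between the clean and adversarial predictions,
    together with the adversarial perturbation of norm about epsilon.
    """
    rng = np.random.default_rng(seed)
    p = softmax(weights @ x)  # clean prediction, used as a fixed target

    # Start from a random unit direction d.
    d = rng.normal(size=x.shape)
    d /= np.linalg.norm(d)

    # One power-iteration step: gradient of KL(p || q(x + r)) with respect
    # to r, evaluated at r = xi * d. For a linear softmax model this
    # gradient has the closed form weights.T @ (q - p).
    q = softmax(weights @ (x + xi * d))
    grad = weights.T @ (q - p)

    # Scale the estimated worst-case direction to length epsilon.
    r_adv = epsilon * grad / (np.linalg.norm(grad) + 1e-12)
    q_adv = softmax(weights @ (x + r_adv))
    return kl_divergence(p, q_adv), r_adv

# Hypothetical sizes: 20 input features, 10 classes.
rng = np.random.default_rng(1)
weights = rng.normal(size=(10, 20))
x = rng.normal(size=20)
loss, r_adv = vat_loss(x, weights, epsilon=1.0)
```

In a real network the gradient would come from automatic differentiation rather than a closed form, and the resulting loss would be added to the supervised loss as a regularizer.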

Findings

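01. Reached an overall macro-averaged event-based F-score of 34.6% on task 4 of the DCASE 2018 challenge.

02. Achieved a relative improvement of 20.5% over the baseline system.

03. Used unlabeled data alongside weakly labeled data for regularization through virtual adversarial training.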
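The two figures above rest on simple arithmetic, which can be sketched as follows. This is an illustrative sketch only: the function names, class names, and counts are made up, and it does not reproduce event-based matching of onsets and offsets, which the challenge evaluation performs before counting true and false positives.

```python
def f_score(tp, fp, fn):
    # F1 from true positives, false positives, and false negatives.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f_score(counts_per_class):
    # Macro averaging: compute F1 per class, then take the unweighted mean,
    # so rare classes count as much as frequent ones.
    scores = [f_score(*counts) for counts in counts_per_class.values()]
    return sum(scores) / len(scores)

def relative_improvement(new_score, baseline_score):
    # Improvement expressed as a fraction of the baseline score.
    return (new_score - baseline_score) / baseline_score

# Made-up (tp, fp, fn) counts for three hypothetical classes.
counts = {
    "class_a": (8, 2, 2),
    "class_b": (1, 1, 3),
    "class_c": (0, 0, 4),
}
macro = macro_f_score(counts)
gain = relative_improvement(0.6, 0.5)
```

Note that a relative improvement is measured against the baseline score itself, so it is not the same as a difference in percentage points.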

Abstract

In this paper, we present a gated convolutional recurrent neural network based approach to task 4 of the DCASE 2018 challenge, large-scale weakly labelled semi-supervised sound event detection in domestic environments. Gated linear units and a temporal attention layer are used to predict the onset and offset of sound events in 10 s long audio clips, whereby only weakly labelled data is used for training. Virtual adversarial training is used for regularization, utilizing both labelled and unlabelled data. Furthermore, we introduce self-adaptive label refinement, a method that allows unsupervised adaptation of our trained system to refine the accuracy of frame-level class predictions. The proposed system reaches an overall macro-averaged event-based F-score of 34.6%, a relative improvement of 20.5% over the baseline system.
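To make the roles of the gated linear units and the temporal attention layer more concrete, here is a minimal NumPy sketch. It is a simplification under stated assumptions: the gating is applied to dense projections rather than convolutions, the weights are random, and all names and sizes (`gated_linear_unit`, `attention_pool`, 500 frames, 10 classes) are hypothetical rather than taken from the paper. It shows how frame-level class probabilities can be pooled into a clip-level prediction, which is what allows training against weak (clip-level) labels while still producing per-frame outputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_linear_unit(features, w_lin, w_gate):
    # GLU: a linear projection multiplied element-wise by a sigmoid gate,
    # letting the network suppress uninformative time-frequency regions.
    return (features @ w_lin) * sigmoid(features @ w_gate)

def attention_pool(frame_probs, attention_logits):
    """Pool frame-level probabilities into clip-level probabilities.

    frame_probs:      (frames, classes), values in [0, 1]
    attention_logits: (frames, classes), unnormalised attention scores
    """
    # Softmax over the time axis, so each class's weights sum to 1.
    shifted = attention_logits - attention_logits.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    return (weights * frame_probs).sum(axis=0)

# Hypothetical sizes for one clip.
rng = np.random.default_rng(0)
frames, n_features, n_hidden, n_classes = 500, 64, 32, 10
x = rng.normal(size=(frames, n_features))

hidden = gated_linear_unit(
    x,
    rng.normal(size=(n_features, n_hidden)) * 0.1,
    rng.normal(size=(n_features, n_hidden)) * 0.1,
)
frame_probs = sigmoid(hidden @ rng.normal(size=(n_hidden, n_classes)))
attention_logits = hidden @ rng.normal(size=(n_hidden, n_classes))

# Clip-level prediction, comparable against a weak label vector.
clip_probs = attention_pool(frame_probs, attention_logits)
```

At inference time, onsets and offsets could then be read off `frame_probs`, for example by thresholding each class over time; the paper's exact post-processing is not described in this abstract.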

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies