Forward-Backward Convolutional Recurrent Neural Networks and Tag-Conditioned Convolutional Neural Networks for Weakly Labeled Semi-supervised Sound Event Detection

Janek Ebbers; Reinhold Haeb-Umbach

arXiv:2103.06581·eess.AS·March 12, 2021

TL;DR

This paper introduces novel forward-backward convolutional recurrent neural networks and tag-conditioned CNNs for weakly labeled semi-supervised sound event detection, achieving state-of-the-art results in the DCASE 2020 Challenge.

Contribution

The paper presents two new models, FBCRNN and tag-conditioned CNN, for improved sound event detection using weak labels and pseudo strong labels.

Findings

01. Achieved an 18.0% improvement in event-based F1-score over the baseline.

02. Outperformed the top challenge systems on the validation set.

03. The proposed models enable detection on short audio segments.

Abstract

In this paper we present our system for the detection and classification of acoustic scenes and events (DCASE) 2020 Challenge Task 4: Sound event detection and separation in domestic environments. We introduce two new models: the forward-backward convolutional recurrent neural network (FBCRNN) and the tag-conditioned convolutional neural network (CNN). The FBCRNN employs two recurrent neural network (RNN) classifiers sharing the same CNN for preprocessing. With one RNN processing a recording in forward direction and the other in backward direction, the two networks are trained to jointly predict audio tags, i.e., weak labels, at each time step within a recording, given that at each time step they have jointly processed the whole recording. The proposed training encourages the classifiers to tag events as soon as possible. Therefore, after training, the networks can be applied to shorter…
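
The forward-backward tagging idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the running maximum is only a stand-in for a trained RNN, and combining the two directions with `max` is an assumption made for illustration, since the abstract does not say how the two predictions are merged. The function names (`running_forward`, `running_backward`, `joint_tag_scores`) are hypothetical. The sketch shows only the structural point that, at every time step, the forward pass has seen frames `0..t` and the backward pass has seen frames `t+1..T-1`, so together they cover the whole recording.

```python
import math


def running_forward(frame_scores):
    """Toy stand-in for the forward RNN: the output at step t
    depends only on frames 0..t (here: their running maximum)."""
    out, best = [], -math.inf
    for score in frame_scores:
        best = max(best, score)
        out.append(best)
    return out


def running_backward(frame_scores):
    """Toy stand-in for the backward RNN: the output at step t
    depends only on frames t..T-1."""
    return running_forward(frame_scores[::-1])[::-1]


def joint_tag_scores(frame_scores):
    """Combine both directions at every time step.

    At step t the forward output covers frames 0..t and the backward
    output at t+1 covers frames t+1..T-1, so the pair has jointly
    processed the whole recording. Taking the max of the two is an
    assumption for this sketch, not a detail given in the abstract.
    """
    fwd = running_forward(frame_scores)
    bwd = running_backward(frame_scores)
    num_frames = len(frame_scores)
    joint = []
    for t in range(num_frames):
        bwd_next = bwd[t + 1] if t + 1 < num_frames else -math.inf
        joint.append(max(fwd[t], bwd_next))
    return joint


# Hypothetical per-frame scores for one event class.
scores = [-2.0, -1.0, 3.0, -0.5]
joint = joint_tag_scores(scores)
```

In this toy setting every entry of `joint` equals the clip-level maximum, which is why a single weak label can serve as the training target at each time step. In a real system the two directions would be learned recurrent classifiers on top of a shared CNN, as the abstract describes.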

Code & Models

Repositories

fgnt/pb_sed (PyTorch, official)
