Semi-supervised Sound Event Detection using Random Augmentation and Consistency Regularization

Xiaofei Li

arXiv:2102.00154·eess.AS·February 2, 2021

TL;DR

This paper explores semi-supervised sound event detection by combining random audio augmentation with consistency regularization, demonstrating that the combination, especially with the MeanTeacher model, improves detection performance.

Contribution

It introduces an audio-signal random augmentation method and shows that combining it with consistency regularization enhances semi-supervised sound event detection.
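
As a rough illustration of what a random augmentation strategy over a pool of audio transformations could look like, the sketch below picks a few transformations at random and applies each with a random strength. The three transformations, the function names, and the `num_ops` and `max_magnitude` parameters are all assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

# Hypothetical transformation pool; the paper's actual set is not listed here.
def add_noise(x, magnitude, rng):
    """Add white Gaussian noise scaled by magnitude and the signal's std."""
    return x + magnitude * 0.1 * x.std() * rng.standard_normal(x.shape)

def scale_gain(x, magnitude, rng):
    """Scale the amplitude by a random factor in [1 - 0.5*m, 1 + 0.5*m]."""
    return x * (1.0 + 0.5 * magnitude * rng.uniform(-1.0, 1.0))

def time_shift(x, magnitude, rng):
    """Circularly shift the signal by up to magnitude * 10% of its length."""
    max_shift = int(0.1 * magnitude * len(x))
    shift = rng.integers(-max_shift, max_shift + 1)
    return np.roll(x, shift)

TRANSFORMS = [add_noise, scale_gain, time_shift]

def random_augment(x, rng, num_ops=2, max_magnitude=1.0):
    """Apply num_ops distinct, randomly chosen transforms with random magnitudes."""
    picks = rng.choice(len(TRANSFORMS), size=num_ops, replace=False)
    for i in picks:
        magnitude = rng.uniform(0.0, max_magnitude)
        x = TRANSFORMS[i](x, magnitude, rng)
    return x

rng = np.random.default_rng(0)
waveform = np.sin(np.linspace(0.0, 100.0, 16000))  # stand-in for an audio clip
augmented = random_augment(waveform, rng)
```

Because the pool is just a list of functions with a shared signature, adding another transformation only requires appending to `TRANSFORMS`, which is one way a single strategy can cover many transformations.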

Findings

01

Consistency regularization is effective for semi-supervised sound event detection.

02

Combining augmentation with the MeanTeacher model yields the best results.

03

The proposed methods scale well with unlabelled data.

Abstract

Sound event detection is a core module for acoustic environmental analysis. Semi-supervised learning allows the dataset to be scaled up substantially without increasing the annotation budget, and has recently attracted considerable research attention. In this work, we study two advanced semi-supervised learning techniques for sound event detection. First, data augmentation is important for the success of recent deep learning systems; this work studies the audio-signal random augmentation method, which provides an augmentation strategy that can handle a large number of different audio transformations. Second, consistency regularization, which is widely adopted in recent state-of-the-art semi-supervised learning methods, exploits the unlabelled data by constraining the predictions for different transformations of one sample to be identical to the prediction for the sample itself. This work finds that, for…
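
The consistency constraint described above can be sketched as a loss between a student's prediction on a transformed sample and a teacher's prediction on the original sample, with the teacher's weights tracking the student's through an exponential moving average, as in the Mean Teacher approach. This is a minimal sketch under stated assumptions: the toy linear classifier, the mean-squared-error cost, the feature and class dimensions, and the `decay` value are all illustrative choices, not the paper's configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(weights, features):
    """Toy frame-level classifier: (frames, dims) -> per-class probabilities."""
    return sigmoid(features @ weights)

def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between the two sets of predictions."""
    return float(np.mean((student_probs - teacher_probs) ** 2))

def ema_update(teacher_w, student_w, decay=0.999):
    """Move the teacher weights a small step toward the student weights."""
    return decay * teacher_w + (1.0 - decay) * student_w

rng = np.random.default_rng(0)
features = rng.standard_normal((100, 64))                         # unlabelled clip (assumed shape)
transformed = features + 0.1 * rng.standard_normal(features.shape)  # stand-in augmentation

student_w = 0.1 * rng.standard_normal((64, 10))  # 10 event classes (assumed)
teacher_w = student_w.copy()

# No labels are needed: the teacher's output on the original sample is the target.
loss = consistency_loss(predict(student_w, transformed),
                        predict(teacher_w, features))

# After each student optimisation step, the teacher is refreshed by EMA.
teacher_w = ema_update(teacher_w, student_w)
```

In a full training loop this term would be added to the supervised loss on the labelled clips, and only the student would receive gradient updates.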

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis