Guided learning for weakly-labeled semi-supervised sound event detection

Liwei Lin; Xiangdong Wang; Hong Liu; Yueliang Qian

arXiv:1906.02517·cs.LG·February 5, 2020·1 cites

TL;DR

This paper introduces Guided Learning, a semi-supervised approach to sound event detection in which a teacher model aimed at audio tagging guides a student model aimed at boundary detection, so that the student can learn from unlabeled data as well as weakly labeled data.

Contribution

The paper presents a teacher-student framework that assigns audio tagging and boundary detection to separate models, improving semi-supervised sound event detection without forcing a single model to trade off the two sub-targets.

Findings

01

Achieves competitive performance on the DCASE2018 Task4 dataset

02

Effectively leverages unlabeled data for boundary detection

03

Demonstrates the benefit of separating sub-tasks in semi-supervised learning

Abstract

We propose a simple but efficient method termed Guided Learning for weakly-labeled semi-supervised sound event detection (SED). Weakly-labeled SED implies two sub-targets: audio tagging and boundary detection. Instead of designing a single model that trades off the two sub-targets, we design a teacher model aimed at audio tagging to guide a student model aimed at boundary detection as it learns from the unlabeled data. The guidance is guaranteed by the gap in audio tagging performance between the two models. Meanwhile, the student model, freed from the trade-off, is able to provide better boundary detection results. We propose a principle for designing the two models based on the relation between the temporal compression scale and the two sub-targets. We also propose an end-to-end semi-supervised learning process for these two models to enable their…
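The guidance step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 0.5 threshold, and the use of max pooling to turn the student's frame-level output into clip-level probabilities are all assumptions made for illustration. The idea shown is only that, on an unlabeled clip, the teacher's clip-level tagging output is turned into pseudo-labels and the student is penalized for disagreeing with them.

```python
import numpy as np

def clip_probs_from_frames(frame_probs):
    """Aggregate frame-level probabilities of shape (T, C) into clip-level
    probabilities of shape (C,). Max pooling over time is an assumption here."""
    return frame_probs.max(axis=0)

def binary_cross_entropy(p, y, eps=1e-7):
    """Mean binary cross-entropy between probabilities p and 0/1 targets y."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def guided_loss(student_clip_probs, teacher_clip_probs, threshold=0.5):
    """Hypothetical guidance loss for one unlabeled clip: binarize the
    teacher's tagging output into pseudo-labels (threshold is an assumption)
    and score the student's clip-level predictions against them."""
    pseudo_labels = (teacher_clip_probs >= threshold).astype(np.float64)
    return binary_cross_entropy(student_clip_probs, pseudo_labels)

# Toy unlabeled clip with 2 event classes and 2 frames.
teacher_clip = np.array([0.9, 0.1])            # teacher's clip-level tags
student_frames = np.array([[0.2, 0.1],
                           [0.8, 0.2]])        # student's frame-level output
student_clip = clip_probs_from_frames(student_frames)
loss = guided_loss(student_clip, teacher_clip)
```

In this sketch the student keeps its frame-level output for boundary detection, while only its pooled clip-level output is compared with the teacher. A student whose clip-level predictions move closer to the teacher's tags receives a lower loss.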

Code & Models

Repositories

Kikyo-16/Sound_event_detection (tf, official)

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis