Weakly supervised CRNN system for sound event detection with large-scale unlabeled in-domain data
Dezhi Wang, Lilun Zhang, Changchun Bao, Kele Xu, Boqing Zhu, Qiuqiang Kong

TL;DR
This paper introduces a weakly supervised CRNN-based sound event detection system that leverages large-scale unlabeled in-domain data with predicted labels, improving detection performance while reducing labeling costs.
Contribution
It proposes a novel joint framework combining a general audio tagging model with a weakly supervised CRNN, effectively utilizing unlabeled data for sound event detection.
Findings
Performance increases with more unlabeled data
Ensemble strategy improves robustness against noisy labels
Achieves a 21.0% F1-score on the DCASE 2018 dataset
Abstract
Sound event detection (SED) is typically posed as a supervised learning problem requiring training data with strong temporal labels of sound events. However, producing datasets with strong labels normally incurs a prohibitive labor cost, which limits the practical application of supervised SED methods. Recent advances in SED focus on detecting sound events by taking advantage of weakly labeled or unlabeled training data. In this paper, we propose a joint framework to solve the SED task using large-scale unlabeled in-domain data. In particular, a state-of-the-art general audio tagging model is first employed to predict weak labels for unlabeled data. In addition, a weakly supervised architecture based on the convolutional recurrent neural network (CRNN) is developed to infer the strong annotations of sound events with the aid of the unlabeled data with…
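The two ideas in the abstract, predicting weak (clip-level) labels for unlabeled audio and recovering strong (timestamped) annotations from a model trained only on weak labels, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 thresholds, the use of max pooling, and the class names are all assumptions made for illustration, and the probabilities are hand-written rather than produced by a real tagging model or CRNN.

```python
from typing import Dict, List, Tuple


def pseudo_weak_labels(tag_probs: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep the classes whose clip-level tagging probability reaches the threshold.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    return sorted(c for c, p in tag_probs.items() if p >= threshold)


def clip_probability(frame_probs: List[float]) -> float:
    """Pool frame-level probabilities into one clip-level probability.

    Max pooling is assumed here; it lets a frame-level model be trained
    against clip-level (weak) labels only.
    """
    return max(frame_probs)


def frames_to_events(
    frame_probs: List[float], threshold: float = 0.5, hop_seconds: float = 0.02
) -> List[Tuple[float, float]]:
    """Convert frame-level probabilities into (onset, offset) segments in seconds."""
    events = []
    start = None  # index of the first frame of the currently open segment
    for i, p in enumerate(frame_probs):
        active = p >= threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            events.append((start * hop_seconds, i * hop_seconds))
            start = None
    if start is not None:  # segment still open at the end of the clip
        events.append((start * hop_seconds, len(frame_probs) * hop_seconds))
    return events


# Hypothetical tagging output for one unlabeled clip.
weak = pseudo_weak_labels({"Dog": 0.9, "Speech": 0.2, "Dishes": 0.5})

# Hypothetical frame-level output for one class, one value per frame.
frames = [0.1, 0.9, 0.9, 0.2, 0.7]
clip_level = clip_probability(frames)
segments = frames_to_events(frames, hop_seconds=1.0)
```

During training only `clip_level` would be compared with a weak label; at inference the frame-level probabilities are thresholded to obtain onsets and offsets, which is how a weakly supervised model can still output strong annotations.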
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Acoustic Wave Phenomena Research
