Pseudo strong labels for large scale weakly supervised audio tagging
Heinrich Dinkel, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Yujun Wang

TL;DR
This paper introduces pseudo strong labels (PSL), a label augmentation method that improves large-scale weakly supervised audio tagging by using a machine annotator to generate finer supervision, leading to better performance and generalization.
Contribution
The paper proposes PSL, a novel framework that enhances supervision quality in weakly supervised audio tagging by leveraging a machine annotator for label augmentation.
Findings
PSL achieves an mAP of 35.95 on the balanced train subset of Audioset with a MobileNetV2 back-end.
PSL mitigates missing labels in weakly supervised datasets.
Models trained with PSL generalize better to FSD datasets.
Abstract
Large-scale audio tagging datasets inevitably contain imperfect labels, such as clip-wise annotated (temporally weak) tags with no exact on- and offsets, due to a high manual labeling cost. This work proposes pseudo strong labels (PSL), a simple label augmentation framework that enhances the supervision quality for large-scale weakly supervised audio tagging. A machine annotator is first trained on a large weakly supervised dataset and then provides finer supervision for a student model. Using PSL we achieve an mAP of 35.95 on the balanced train subset of Audioset using a MobileNetV2 back-end, significantly outperforming approaches without PSL. An analysis is provided which reveals that PSL mitigates missing labels. Lastly, we show that models trained with PSL also generalize better to the Freesound datasets (FSD) than their weakly trained counterparts.
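The annotator-then-student idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the chunk length, the function names (`split_into_chunks`, `pseudo_strong_labels`, `bce`) and the energy-based `toy_teacher` are all assumptions made for illustration, and a real machine annotator would be a neural network trained on the weakly labeled data. The sketch only shows the general shape of the approach: a clip is cut into shorter chunks, the annotator scores each chunk, and the resulting per-chunk probabilities serve as finer-grained targets for a student.

```python
import numpy as np


def split_into_chunks(waveform, chunk_len):
    """Split a 1-D waveform into non-overlapping chunks, zero-padding the last one."""
    n_chunks = int(np.ceil(len(waveform) / chunk_len))
    padded = np.zeros(n_chunks * chunk_len, dtype=waveform.dtype)
    padded[: len(waveform)] = waveform
    return padded.reshape(n_chunks, chunk_len)


def pseudo_strong_labels(waveform, teacher, chunk_len):
    """Run the (hypothetical) teacher on every chunk; returns (n_chunks, n_classes)."""
    chunks = split_into_chunks(waveform, chunk_len)
    return np.stack([teacher(chunk) for chunk in chunks])


def bce(pred, target, eps=1e-7):
    """Binary cross-entropy between student predictions and soft per-chunk targets."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred)))


def toy_teacher(chunk):
    """Stand-in annotator (assumption): scores two classes from chunk energy."""
    energy = float(np.mean(chunk ** 2))
    p_loud = energy / (1.0 + energy)
    return np.array([p_loud, 1.0 - p_loud])


# A clip that is silent in its first half and active in its second half.
clip = np.concatenate([np.zeros(100), np.ones(100)])
labels = pseudo_strong_labels(clip, toy_teacher, chunk_len=100)

# A student would be trained against `labels` chunk by chunk; here we only
# evaluate the loss of a constant 0.5 prediction against those targets.
student_pred = np.full_like(labels, 0.5)
loss = bce(student_pred, labels)
```

In this toy setup the two chunks of the same clip receive different targets, which is the kind of finer supervision a single clip-level tag cannot express.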
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
