Self-Transcriber: Few-shot Lyrics Transcription with Self-training
Xiaoxue Gao, Xianghu Yue, Haizhou Li

TL;DR
Self-Transcriber introduces a semi-supervised method for lyrics transcription that effectively leverages unlabeled singing data with minimal labeled data, achieving competitive results with significantly less supervision.
Contribution
This work is the first to apply semi-supervised learning with self-training to lyrics transcription, reducing the need for extensive labeled datasets.
Findings
Achieves competitive performance with only 12.7 hours of labeled data.
Outperforms supervised methods trained on much larger datasets.
Demonstrates effectiveness of self-training in lyrics transcription.
Abstract
Current lyrics transcription approaches rely heavily on supervised learning with labeled data, but such data are scarce and manual labeling of singing is expensive. How to benefit from unlabeled data and alleviate the limited-data problem has not been explored for lyrics transcription. We propose the first semi-supervised lyrics transcription paradigm, Self-Transcriber, which leverages unlabeled data using self-training with noisy student augmentation. We attempt to demonstrate the possibility of lyrics transcription with a small amount of labeled data. Self-Transcriber generates pseudo labels for the unlabeled singing using a teacher model, and adds the pseudo-labeled data to the labeled data to update the student model with both self-training and supervised training losses. This work closes the gap between supervised and semi-supervised learning as well as opens doors for few-shot learning of…
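The teacher/student procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `make_pseudo_labels`, `student_objective`, `augment`, and `st_weight` are hypothetical, and combining the two losses as a weighted sum is an assumption, since the abstract only states that both losses are used.

```python
from statistics import fmean
from typing import Callable, List, Sequence, Tuple

Clip = Sequence[float]                  # hypothetical stand-in for singing audio features
Example = Tuple[Clip, str]              # (clip, lyrics transcript)
LossFn = Callable[[Clip, str], float]   # student loss on a single example


def make_pseudo_labels(teacher: Callable[[Clip], str],
                       unlabeled: List[Clip]) -> List[Example]:
    """Transcribe unlabeled singing with the teacher and keep the outputs as labels."""
    return [(clip, teacher(clip)) for clip in unlabeled]


def student_objective(loss_fn: LossFn,
                      labeled: List[Example],
                      pseudo: List[Example],
                      augment: Callable[[Clip], Clip] = lambda clip: clip,
                      st_weight: float = 1.0) -> float:
    """Supervised loss on labeled data plus a weighted self-training loss.

    `augment` stands in for the noisy-student perturbation of the student's
    input; the identity default and the weighted sum are assumptions.
    """
    supervised = fmean([loss_fn(clip, text) for clip, text in labeled])
    self_training = fmean([loss_fn(augment(clip), text) for clip, text in pseudo])
    return supervised + st_weight * self_training
```

In a real system the teacher and student would be neural transcription models and `loss_fn` a sequence loss computed by the student; here they are left as plain callables so that only the data flow is visible.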
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Natural Language Processing Techniques
Methods: RandAugment · Dropout · Stochastic Depth · Noisy Student
