Semi-Supervised Contrastive Learning of Musical Representations
Julien Guinot, Elio Quinton, György Fazekas

TL;DR
This paper introduces SemiSupCon, a semi-supervised contrastive learning method that incorporates musical domain knowledge into representations, improving downstream music information retrieval tasks with limited labeled data.
Contribution
The paper presents a novel semi-supervised contrastive learning framework that combines supervised and self-supervised objectives to learn better musical representations.
Findings
Improves downstream MIR task performance with a moderate amount of labeled data.
Enhances robustness to audio corruptions.
Shows strong transfer learning on related musical tasks.
Abstract
Despite the success of contrastive learning in Music Information Retrieval, the inherent ambiguity of contrastive self-supervision presents a challenge. Relying solely on augmentation chains and self-supervised positive sampling strategies can lead to a pretraining objective that does not capture key musical information for downstream tasks. We introduce semi-supervised contrastive learning (SemiSupCon), a simple method for leveraging musically informed labeled data (supervision signals) in the contrastive learning of musical representations. Our approach introduces musically relevant supervision signals into self-supervised contrastive learning by combining supervised and self-supervised contrastive objectives in a simpler framework than previous approaches. This framework improves downstream performance and robustness to audio corruptions on a range of downstream MIR tasks with…
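The idea of combining supervised and self-supervised contrastive objectives can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact loss, so the code below assumes a SupCon-style formulation in which two samples are positives if they are augmented views of the same clip (self-supervised signal) or if both are labeled and share a label (supervised signal). The function name `semi_supervised_contrastive_loss`, the use of single integer labels with `-1` marking unlabeled samples, and the temperature value are all assumptions made for illustration.

```python
import numpy as np


def semi_supervised_contrastive_loss(embeddings, clip_ids, labels, temperature=0.1):
    """Hypothetical SupCon-style loss mixing self-supervised and supervised positives.

    embeddings: (N, D) float array of sample embeddings.
    clip_ids:   (N,) int array; augmented views of the same clip share an id.
    labels:     (N,) int array; -1 marks an unlabeled sample (an assumption).
    """
    # Cosine similarity between all pairs, scaled by the temperature.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = sim.shape[0]
    self_mask = np.eye(n, dtype=bool)

    # Self-supervised positives: other views of the same clip.
    same_clip = clip_ids[:, None] == clip_ids[None, :]
    # Supervised positives: both samples labeled and sharing a label.
    labeled = labels >= 0
    same_label = (
        (labels[:, None] == labels[None, :]) & labeled[:, None] & labeled[None, :]
    )
    positives = (same_clip | same_label) & ~self_mask

    # Log-softmax over every other sample in the batch (anchor excluded).
    sim = np.where(self_mask, -np.inf, sim)
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max + np.log(
        np.exp(sim - row_max).sum(axis=1, keepdims=True)
    )
    log_prob = sim - log_denominator

    # Average the negative log-probability over each anchor's positives.
    pos_counts = positives.sum(axis=1)
    per_anchor = -np.where(positives, log_prob, 0.0).sum(axis=1) / np.maximum(
        pos_counts, 1
    )
    return per_anchor[pos_counts > 0].mean()


# Toy batch: 4 clips with 2 augmented views each; clips 0 and 1 share label 5.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 16))
clip_ids = np.array([0, 0, 1, 1, 2, 2, 3, 3])
unlabeled = np.full(8, -1)
partially_labeled = np.array([5, 5, 5, 5, -1, -1, -1, -1])

loss_self_supervised = semi_supervised_contrastive_loss(embeddings, clip_ids, unlabeled)
loss_semi_supervised = semi_supervised_contrastive_loss(
    embeddings, clip_ids, partially_labeled
)
```

With every label set to `-1`, the positive mask contains only same-clip pairs and the sketch reduces to a purely self-supervised contrastive loss; labeled samples simply add extra positives to the same objective, which is one way to read the "simple" combination the abstract describes.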
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
Methods: Contrastive Learning
