Contrastive Learning of Musical Representations
Janne Spijkervet, John Ashley Burgoyne

TL;DR
This paper introduces CLMR, a self-supervised contrastive learning framework for music representations that works on raw audio data, requiring no labels, and demonstrates strong transferability and data efficiency in music classification tasks.
Contribution
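The paper presents CLMR, a contrastive learning approach for music that adapts SimCLR to raw audio using a chain of audio data augmentations. A linear classifier trained on its label-free representations achieves higher average precision than supervised models on MagnaTagATune.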
Findings
A linear classifier on CLMR representations achieves higher average precision than supervised models on MagnaTagATune
Performs comparably to supervised models on the Million Song dataset
Enables effective learning with only 1% labeled data
Abstract
While deep learning has enabled great advances in many areas of music, labeled music datasets remain especially hard, expensive, and time-consuming to create. In this work, we introduce SimCLR to the music domain and contribute a large chain of audio data augmentations to form a simple framework for self-supervised, contrastive learning of musical representations: CLMR. This approach works on raw time-domain music data and requires no labels to learn useful representations. We evaluate CLMR in the downstream task of music classification on the MagnaTagATune and Million Song datasets and present an ablation study to test which of our music-related innovations over SimCLR are most effective. A linear classifier trained on the proposed representations achieves a higher average precision than supervised models on the MagnaTagATune dataset, and performs comparably on the Million Song…
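The contrastive objective that SimCLR-style frameworks optimize can be illustrated with a short sketch. The block below is a minimal NumPy version of the NT-Xent loss used by SimCLR, not the authors' implementation: the function name `nt_xent_loss`, the temperature value of 0.5, and the array shapes are assumptions chosen for illustration. Each row of `z_a` and the matching row of `z_b` stand for embeddings of two differently augmented views of the same audio clip.

```python
import numpy as np

def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for paired embeddings z_a, z_b of shape (N, D).

    Hypothetical helper for illustration; temperature=0.5 is an assumed value.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)              # stack both views: (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # unit length, so dot product = cosine similarity
    sim = (z @ z.T) / temperature                       # pairwise similarities: (2N, 2N)
    np.fill_diagonal(sim, -np.inf)                      # a view is never its own candidate

    # Row i in the first half is paired with row i + N, and vice versa.
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))

    # Average negative log-probability of picking the true positive.
    return -log_prob[np.arange(2 * n), positives].mean()
```

In this sketch, every other clip in the batch acts as a negative, so the loss falls when the two views of the same clip are closer to each other than to views of other clips.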
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
Methods: Contrastive Learning · 1x1 Convolution · Average Pooling · Bottleneck Residual Block · Global Average Pooling · Residual Connection · Convolution · Batch Normalization · Kaiming Initialization · Residual Block
