Music auto-tagging in the long tail: A few-shot approach
T. Aleksandra Ma, Alexander Lerch

TL;DR
This paper introduces a few-shot learning approach for music auto-tagging that leverages pre-trained features and a simple linear classifier, enabling effective tagging with minimal labeled data, especially for long-tail tags.
Contribution
It demonstrates that a lightweight linear probe using pre-trained features can achieve near state-of-the-art performance with significantly less training data in music auto-tagging.
Findings
Achieves performance close to state-of-the-art with only 20 samples per tag.
Performs competitively with full-data models when trained on the entire dataset.
Effectively handles auto-tagging of long-tail tags when labeled data is limited.
Abstract
In the realm of digital music, using tags to efficiently organize and retrieve music from extensive databases is crucial for music catalog owners. Human tagging by experts is labor-intensive but mostly accurate, whereas automatic tagging through supervised learning has approached satisfying accuracy but is restricted to a predefined set of training tags. Few-shot learning offers a viable solution to expand beyond this small set of predefined tags by enabling models to learn from only a few human-provided examples to understand tag meanings and subsequently apply these tags autonomously. We propose to integrate few-shot learning methodology into multi-label music auto-tagging by using features from pre-trained models as inputs to a lightweight linear classifier, also known as a linear probe. We investigate different popular pre-trained features, as well as different few-shot…
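The linear probe described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy three-dimensional vectors stand in for features that would come from a frozen pre-trained audio model, and the function names (`train_linear_probe`, `predict_tags`), the learning rate, and the epoch count are all assumptions chosen for illustration. Each tag gets its own logistic unit, so several tags can be active for one track (multi-label), and only the linear weights are trained. The sketch uses the standard library only; in practice a numerical library would be the usual choice.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    # Linear score w.x + b for one tag.
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_probe(features, labels, epochs=500, lr=1.0):
    """Fit one logistic unit per tag on frozen features.

    Uses full-batch gradient descent on the binary cross-entropy loss.
    features: list of embedding vectors; labels: list of 0/1 lists, one entry per tag.
    """
    n, n_dim, n_tags = len(features), len(features[0]), len(labels[0])
    weights = [[0.0] * n_dim for _ in range(n_tags)]
    biases = [0.0] * n_tags
    for _ in range(epochs):
        for t in range(n_tags):
            grad_w, grad_b = [0.0] * n_dim, 0.0
            for x, y in zip(features, labels):
                err = sigmoid(score(weights[t], biases[t], x)) - y[t]
                for d in range(n_dim):
                    grad_w[d] += err * x[d]
                grad_b += err
            for d in range(n_dim):
                weights[t][d] -= lr * grad_w[d] / n
            biases[t] -= lr * grad_b / n
    return weights, biases


def predict_tags(weights, biases, x, threshold=0.5):
    # Independent decision per tag: a track may receive zero, one, or several tags.
    return [int(sigmoid(score(w, b, x)) >= threshold) for w, b in zip(weights, biases)]


# Toy "few-shot" support set: hypothetical embeddings, two hypothetical tags.
features = [
    [1.0, 0.0, 0.2], [0.9, 0.1, 0.1],   # tag 0 only
    [0.0, 1.0, 0.3], [0.1, 0.9, 0.2],   # tag 1 only
    [1.0, 1.0, 0.1],                    # both tags
    [0.0, 0.0, 0.2],                    # neither tag
]
labels = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0]]

weights, biases = train_linear_probe(features, labels)
print(predict_tags(weights, biases, [0.95, 0.05, 0.15]))
print(predict_tags(weights, biases, [0.05, 0.95, 0.25]))
```

Because the feature extractor stays frozen, adding a new tag in this setup only means fitting one more weight vector on a handful of labeled examples, which is what makes the approach attractive for long-tail tags.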
Taxonomy
Topics: Music and Audio Processing
Methods: Sparse Evolutionary Training
