Lyric document embeddings for music tagging

Matt McVicar; Bruno Di Giorgi; Baris Dundar; Matthias Mauch

arXiv:2112.11436 · cs.CL · December 22, 2021 · 1 citation

Open Access

TL;DR

This paper investigates various methods for creating fixed-dimensional lyric embeddings for music tagging, finding that simple averaging of pretrained embeddings often outperforms complex neural models across multiple tasks.

Contribution

It provides an extensive empirical comparison of token-level and document-level lyric embedding methods on a large-scale dataset for music tagging.

Findings

01

Averaging pretrained word embeddings outperforms more complex neural architectures on many downstream metrics.

02

Simple averaging is competitive with, and often better than, recurrent and attention-based models.

03

The evaluation spans a wide range of tagging tasks, including genre classification, explicit content identification, and era detection.

Abstract

We present an empirical study on embedding the lyrics of a song into a fixed-dimensional feature for the purpose of music tagging. Five methods of computing token-level and four methods of computing document-level representations are trained on an industrial-scale dataset of tens of millions of songs. We compare simple averaging of pretrained embeddings to modern recurrent and attention-based neural architectures. Evaluating on a wide range of tagging tasks such as genre classification, explicit content identification and era detection, we find that averaging word embeddings outperforms more complex architectures in many downstream metrics.
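The averaging baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vocabulary `TOY_VECTORS`, the whitespace `tokenize` helper, the 3-dimensional vectors, and the choice to skip out-of-vocabulary tokens are all assumptions made for illustration. The point is only that a lyric of any length maps to one vector of fixed size.

```python
from typing import Dict, List

# Hypothetical toy "pretrained" vectors (3 dimensions); real embeddings
# would come from a pretrained model and be much larger.
TOY_VECTORS: Dict[str, List[float]] = {
    "love": [1.0, 0.0, 2.0],
    "night": [0.0, 1.0, 0.0],
    "dance": [2.0, 2.0, 1.0],
}


def tokenize(lyrics: str) -> List[str]:
    """Assumed tokenizer: lowercase and split on whitespace."""
    return lyrics.lower().split()


def average_embedding(
    tokens: List[str], vectors: Dict[str, List[float]], dim: int
) -> List[float]:
    """Average the vectors of all in-vocabulary tokens.

    Out-of-vocabulary tokens are skipped (an assumption of this sketch).
    If no token is in the vocabulary, a zero vector is returned.
    """
    total = [0.0] * dim
    count = 0
    for token in tokens:
        vec = vectors.get(token)
        if vec is None:
            continue  # skip unknown tokens
        for i, value in enumerate(vec):
            total[i] += value
        count += 1
    if count == 0:
        return total
    return [component / count for component in total]


# Lyrics of any length map to a vector with the same number of dimensions.
doc_vector = average_embedding(tokenize("Love love dance tonight"), TOY_VECTORS, 3)
empty_vector = average_embedding(tokenize("unknown words only"), TOY_VECTORS, 3)
```

The resulting `doc_vector` could then be passed to any downstream tagger (for example a classifier for genre or explicit content). Which classifier is used is outside the scope of this sketch.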

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Diverse Musicological Studies