LyCon: Lyrics Reconstruction from the Bag-of-Words Using Large Language Models

Haven Kim; Kahyun Choi

arXiv:2408.14750·cs.CL·August 28, 2024

TL;DR

This paper presents LyCon, a method that uses large language models and metadata to reconstruct copyright-free lyrics from Bag-of-Words datasets, enabling lyric research without copyright issues.

Contribution

The study introduces a novel approach for generating lyrics from BoW datasets using metadata and large language models, and provides a publicly available dataset of reconstructed lyrics.

Findings

01 Successfully reconstructed lyrics aligned with metadata

02 Created a publicly accessible dataset of reconstructed lyrics

03 Enabled new research possibilities in lyric studies

Abstract

This paper addresses the unique challenge of conducting research in lyric studies, where direct use of lyrics is often restricted due to copyright concerns. Unlike typical data, internet-sourced lyrics are frequently protected under copyright law, necessitating alternative approaches. Our study introduces a novel method for generating copyright-free lyrics from publicly available Bag-of-Words (BoW) datasets, which contain the vocabulary of lyrics but not the lyrics themselves. Utilizing metadata associated with BoW datasets and large language models, we successfully reconstructed lyrics. We have compiled and made available a dataset of reconstructed lyrics, LyCon, aligned with metadata from renowned sources including the Million Song Dataset, Deezer Mood Detection Dataset, and AllMusic Genre Dataset, available for public access. We believe that the integration of metadata such as mood…
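To make the setup in the abstract concrete, the following is a minimal sketch of the general idea, not the authors' implementation. It shows what a Bag-of-Words record retains (word counts, no word order), how such a record could be combined with metadata into a text prompt for a language model, and how a candidate lyric could be checked against the original counts. The function names (`bag_of_words`, `build_prompt`, `coverage`), the prompt wording, the tokenization rule, and the metadata keys are all assumptions made for illustration; the paper's actual prompts, models, and evaluation are not reproduced here, and no model call is made.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase word tokens; word order is discarded."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def build_prompt(bow: Counter, metadata: dict) -> str:
    """Assemble a hypothetical prompt from word counts and metadata.

    The wording is an illustrative assumption, not the prompt used in the paper.
    """
    words = ", ".join(f"{word} (x{count})" for word, count in sorted(bow.items()))
    tags = "; ".join(f"{key}: {value}" for key, value in sorted(metadata.items()))
    return (
        "Write original song lyrics.\n"
        f"Metadata: {tags}\n"
        f"Use these words with roughly these counts: {words}\n"
    )


def coverage(bow: Counter, candidate: str) -> float:
    """Fraction of the BoW tokens (with multiplicity) found in the candidate text."""
    candidate_counts = bag_of_words(candidate)
    matched = sum(min(count, candidate_counts[word]) for word, count in bow.items())
    total = sum(bow.values())
    return matched / total if total else 0.0


# Hypothetical BoW record and metadata, invented for this example.
example_bow = Counter({"love": 2, "night": 1})
example_metadata = {"genre": "rock", "mood": "calm"}
example_prompt = build_prompt(example_bow, example_metadata)
```

The prompt string would be passed to whichever language model is available, and `coverage` gives a rough check of how much of the original vocabulary a generated lyric actually uses. A score of 1.0 means every counted word appears at least as often as in the BoW record; it says nothing about fluency or fit with the metadata.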

Code & Models

Repositories

havenpersona/lycon (Official)

Taxonomy

Topics: Music and Audio Processing · Natural Language Processing Techniques · Computational and Text Analysis Methods