Syllable-level lyrics generation from melody exploiting character-level language model

Zhe Zhang; Karol Lasocki; Yi Yu; Atsuhiro Takasu

arXiv:2310.00863·cs.CL·January 31, 2024

TL;DR

This paper presents a method for generating syllable-level lyrics from melodies by fine-tuning character-level language models and incorporating their linguistic knowledge into the beam search of a syllable-level Transformer generator, improving coherence without training a new language model.

Contribution

It introduces a novel approach that fine-tunes existing character-level language models for syllable-level lyrics generation from melodies, incorporating linguistic knowledge into the decoding process.

Findings

01

Enhanced coherence and correctness of generated lyrics

02

Eliminates the need for training new language models

03

Improvements demonstrated through both ChatGPT-based evaluation and human subjective evaluation

Abstract

The generation of lyrics tightly connected to accompanying melodies involves establishing a mapping between musical notes and syllables of lyrics. This process requires a deep understanding of musical constraints and of semantic patterns at the syllable, word, and sentence levels. However, pre-trained language models designed specifically for the syllable level are not publicly available. To address these challenges, we propose fine-tuning character-level language models for syllable-level lyrics generation from symbolic melody. In particular, our method incorporates the linguistic knowledge of the language model into the beam search process of a syllable-level Transformer generator network. Additionally, by exploring ChatGPT-based evaluation for generated lyrics, along with human subjective evaluation, we demonstrate that our approach enhances…
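
To make the beam search idea in the abstract more concrete, here is a minimal sketch of one way a character-level language model could re-rank syllable candidates during beam search. It is an illustration under stated assumptions, not the authors' implementation: the weighted sum of log-probabilities, the `lm_weight` parameter, and the toy `toy_generator` and `toy_char_lm` functions are all hypothetical stand-ins for the fine-tuned models described above.

```python
import math
from typing import Callable, Dict, List, Tuple

def beam_search(
    step_logprobs: Callable[[List[str]], Dict[str, float]],
    lm_logprob: Callable[[str], float],
    length: int,
    beam_width: int = 3,
    lm_weight: float = 1.0,
) -> List[str]:
    """Return the best syllable sequence under generator + weighted LM score.

    ASSUMPTION: scores are combined as gen_logprob + lm_weight * lm_logprob.
    The paper may combine them differently.
    """
    beams: List[Tuple[List[str], float]] = [([], 0.0)]  # (prefix, generator score)
    for _ in range(length):
        candidates = []
        for prefix, gen_score in beams:
            for syllable, logp in step_logprobs(prefix).items():
                new_prefix = prefix + [syllable]
                new_gen = gen_score + logp
                # The character-level LM scores the concatenated characters.
                combined = new_gen + lm_weight * lm_logprob("".join(new_prefix))
                candidates.append((combined, new_prefix, new_gen))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = [(p, g) for _, p, g in candidates[:beam_width]]
    return beams[0][0]

# Hypothetical stand-in for the syllable-level generator: a fixed
# distribution over three syllables, ignoring the prefix and the melody.
def toy_generator(prefix: List[str]) -> Dict[str, float]:
    return {"lo": math.log(0.5), "ve": math.log(0.3), "xq": math.log(0.2)}

# Hypothetical stand-in for the character-level LM: character bigrams seen
# in a tiny word list get a high probability, unseen bigrams a low one.
SEEN_BIGRAMS = {w[i:i + 2] for w in ["love", "over"] for i in range(len(w) - 1)}

def toy_char_lm(text: str) -> float:
    bigrams = [text[i:i + 2] for i in range(len(text) - 1)]
    return sum(math.log(0.9 if b in SEEN_BIGRAMS else 0.01) for b in bigrams)

if __name__ == "__main__":
    print(beam_search(toy_generator, toy_char_lm, length=2, lm_weight=0.0))
    print(beam_search(toy_generator, toy_char_lm, length=2, lm_weight=1.0))
```

With `lm_weight=0.0` the search follows the generator alone and repeats its most probable syllable; with `lm_weight=1.0` the character-level score steers the beam toward a sequence whose characters form a plausible word. In a real system both scoring functions would be neural models and the generator would condition on the melody.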

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Linear Layer · Label Smoothing · Absolute Position Encodings · Adam · Residual Connection · Layer Normalization · Softmax