Syllable-level lyrics generation from melody exploiting character-level language model
Zhe Zhang, Karol Lasocki, Yi Yu, Atsuhiro Takasu

TL;DR
This paper presents a method for generating syllable-level lyrics from melodies by fine-tuning character-level language models and integrating their linguistic knowledge into the generation process, improving coherence without training a new syllable-level language model.
Contribution
The paper introduces an approach that fine-tunes existing character-level language models for syllable-level lyrics generation from melodies and incorporates the language model's linguistic knowledge into the beam search decoding of a syllable-level Transformer generator.
Findings
Enhances the coherence and correctness of generated lyrics
Eliminates the need to train a new syllable-level language model
Supports these results with ChatGPT-based evaluation and human subjective assessment
Abstract
Generating lyrics that are tightly connected to an accompanying melody involves establishing a mapping between musical notes and the syllables of the lyrics. This process requires a deep understanding of musical constraints and of semantic patterns at the syllable, word, and sentence levels. However, pre-trained language models designed specifically at the syllable level are not publicly available. To address these challenges, we propose fine-tuning character-level language models for syllable-level lyrics generation from symbolic melody. In particular, our method incorporates the linguistic knowledge of the language model into the beam search process of a syllable-level Transformer generator network. Additionally, by exploring ChatGPT-based evaluation of the generated lyrics, along with human subjective evaluation, we demonstrate that our approach enhances…
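The beam search idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `generator` and `char_lm` callables, the `lm_weight` parameter, the additive score combination, and the choice to join syllables with spaces are all assumptions made for illustration, and the two toy models are stand-ins for a syllable-level Transformer and a fine-tuned character-level language model.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
#   generator(prefix) -> {syllable: log-probability of that next syllable}
#   char_lm(text)     -> log-score of the character string `text`
Generator = Callable[[Tuple[str, ...]], Dict[str, float]]
CharLM = Callable[[str], float]


def rescored_beam_search(
    generator: Generator,
    char_lm: CharLM,
    num_notes: int,
    beam_width: int = 3,
    lm_weight: float = 0.5,
) -> List[str]:
    """Pick one syllable per note, ranking beams by generator + weighted LM score."""
    # Each beam holds (syllable prefix, accumulated generator log-probability).
    beams: List[Tuple[Tuple[str, ...], float]] = [((), 0.0)]
    for _ in range(num_notes):
        candidates = []
        for prefix, gen_score in beams:
            for syllable, logp in generator(prefix).items():
                new_prefix = prefix + (syllable,)
                new_gen = gen_score + logp
                # The LM scores the whole prefix, so it is added once per
                # candidate rather than accumulated across steps.
                combined = new_gen + lm_weight * char_lm(" ".join(new_prefix))
                candidates.append((combined, new_prefix, new_gen))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = [(p, g) for _, p, g in candidates[:beam_width]]
    return list(beams[0][0])


def toy_generator(prefix: Tuple[str, ...]) -> Dict[str, float]:
    # Stand-in generator: always prefers "la", ignoring the context.
    return {"la": math.log(0.5), "love": math.log(0.3), "you": math.log(0.2)}


def toy_char_lm(text: str) -> float:
    # Stand-in language model: penalises immediately repeated syllables.
    words = text.split()
    repeats = sum(a == b for a, b in zip(words, words[1:]))
    return -2.0 * repeats


print(rescored_beam_search(toy_generator, toy_char_lm, 3, lm_weight=0.0))
# ['la', 'la', 'la']
print(rescored_beam_search(toy_generator, toy_char_lm, 3, lm_weight=1.0))
# ['la', 'love', 'la']
```

With `lm_weight=0.0` the search follows the generator alone and repeats its favourite syllable; with the language-model term switched on, the repetition penalty steers the beam toward a more varied sequence. How the actual method weights and combines the two scores is not stated in the text above.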
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Linear Layer · Label Smoothing · Absolute Position Encodings · Adam · Residual Connection · Layer Normalization · Softmax
