SongGLM: Lyric-to-Melody Generation with 2D Alignment Encoding and Multi-Task Pre-Training

Jiaxing Yu; Xinda Wu; Yunfei Xu; Tieyao Zhang; Songruoyao Wu; Le Ma; Kejun Zhang

arXiv:2412.18107·eess.AS·December 25, 2024

Open Access

TL;DR

SongGLM introduces a novel lyric-to-melody generation approach using 2D alignment encoding and multi-task pre-training, significantly improving alignment accuracy and harmonic quality in generated melodies.

Contribution

The paper presents a unified symbolic representation with 2D alignment encoding and a multi-task pre-training framework for improved lyric-melody generation.

Findings

01 Enhanced lyric-melody alignment accuracy

02 Improved harmonic consistency in generated melodies

03 Outperforms previous baseline methods in quality metrics

Abstract

Lyric-to-melody generation aims to automatically create melodies based on given lyrics, requiring the capture of complex and subtle correlations between them. However, previous works usually suffer from two main challenges: 1) lyric-melody alignment modeling, which is often simplified to one-syllable/word-to-one-note alignment, while others have the problem of low alignment accuracy; 2) lyric-melody harmony modeling, which usually relies heavily on intermediates or strict rules, limiting the model's capabilities and generative diversity. In this paper, we propose SongGLM, a lyric-to-melody generation system that leverages 2D alignment encoding and multi-task pre-training based on the General Language Model (GLM) to guarantee the alignment and harmony between lyrics and melodies. Specifically, 1) we introduce a unified symbolic song representation for lyrics and melodies with word-level and…
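
To make the first challenge concrete, the sketch below contrasts one-word-to-one-note alignment with the more general case where a single word is sung over several notes. It is only an illustration of the alignment problem named in the abstract, not SongGLM's actual representation or encoding. The `Aligned` type, the function names, and the toy lyric with its MIDI pitches are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical toy format (an assumption, not the paper's representation):
# each word is paired with the MIDI pitches sung on it.
Aligned = List[Tuple[str, List[int]]]


def note_word_indices(song: Aligned) -> List[int]:
    """For every note, in order, return the index of the word it belongs to."""
    return [w for w, (_, pitches) in enumerate(song) for _ in pitches]


def is_one_to_one(song: Aligned) -> bool:
    """True only if every word carries exactly one note."""
    return all(len(pitches) == 1 for _, pitches in song)


# Made-up example: "on" spans two notes and "me" spans three.
song = [("shine", [60]), ("on", [62, 64]), ("me", [65, 64, 62])]
indices = note_word_indices(song)
one_to_one = is_one_to_one(song)
```

Here `indices` records, for each of the six notes, which word it is sung on, and `one_to_one` is false because two of the three words carry more than one note. A model restricted to one-to-one alignment could not produce this melody for this lyric.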

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Human Motion and Animation