Melody-Guided Music Generation

Shaopeng Wei; Manzhen Wei; Haoyu Wang; Yu Zhao; Gang Kou

arXiv:2409.20196 · cs.SD · December 31, 2024

TL;DR

The paper introduces MG2, a melody-guided text-to-music generation model that outperforms existing models with fewer resources by aligning text, audio, and melody using contrastive pretraining and a retrieval-augmented diffusion process.

Contribution

It proposes a novel contrastive language-music pretraining method and a melody-guided diffusion model for efficient, high-quality text-to-music generation with limited data and parameters.
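As a rough illustration of what contrastive alignment across text, audio, and melody can look like, here is a minimal sketch of a symmetric InfoNCE-style loss summed over the three modality pairs. This is a generic formulation, not the paper's exact pretraining objective: the function names, the choice of pairs, the temperature, and the toy embeddings are all assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, targets, temperature=0.1):
    """Mean cross-entropy of matching anchor i to target i among all targets."""
    a = [normalize(v) for v in anchors]
    t = [normalize(v) for v in targets]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every target, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, tj)) / temperature for tj in t]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def symmetric_pair_loss(x, y, temperature=0.1):
    """Average of the x-to-y and y-to-x contrastive losses."""
    return 0.5 * (info_nce(x, y, temperature) + info_nce(y, x, temperature))


def tri_modal_loss(text, audio, melody, temperature=0.1):
    """Assumed combination: sum of the three pairwise symmetric losses."""
    return (
        symmetric_pair_loss(text, audio, temperature)
        + symmetric_pair_loss(text, melody, temperature)
        + symmetric_pair_loss(audio, melody, temperature)
    )


# Toy batch of two items with 2-dimensional embeddings (hypothetical values).
aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = tri_modal_loss(aligned, aligned, aligned)
loss_misaligned = tri_modal_loss(aligned, shuffled, aligned)
```

When matching items share the same direction in embedding space the loss is close to zero, and it grows when one modality is paired with the wrong items, which is the pressure that pulls the three representations together during training.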

Findings

01 MG2 surpasses current open-source models in quality.

02 MG2 achieves high performance with fewer than one third of the parameters of competing models.

03 Human evaluations confirm practical effectiveness.

Abstract

We present the Melody-Guided Music Generation (MG2) model, a novel approach that uses melody to guide text-to-music generation and that, despite its simple method and limited resources, achieves excellent performance. Specifically, we first align text with audio waveforms and their associated melodies using the newly proposed Contrastive Language-Music Pretraining, so that the learned text representation is fused with implicit melody information. Subsequently, we condition the retrieval-augmented diffusion module on both the text prompt and the retrieved melody. This allows MG2 to generate music that reflects the content of the given text description while maintaining intrinsic harmony under the guidance of explicit melody information. We conducted extensive experiments on two public datasets: MusicCaps and MusicBench. Surprisingly, the experimental results demonstrate that the proposed MG2…
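The retrieval step described in the abstract can be pictured as a nearest-neighbour lookup: embed the text prompt, find the most similar melody in a database, and hand both to the generator as conditioning. The sketch below assumes cosine similarity over a small in-memory list. The names `retrieve_melody` and `build_condition`, the database layout, the embedding values, and the use of plain concatenation are hypothetical and are not taken from the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_melody(text_embedding, melody_db):
    """Return the (melody_id, embedding) entry most similar to the text embedding."""
    return max(melody_db, key=lambda item: cosine(text_embedding, item[1]))


def build_condition(text_embedding, melody_embedding):
    """Concatenate both embeddings as a stand-in for the real conditioning scheme."""
    return list(text_embedding) + list(melody_embedding)


# Hypothetical melody database and text embedding in a shared 3-dimensional space.
melody_db = [
    ("melody_a", [0.9, 0.1, 0.0]),
    ("melody_b", [0.0, 0.2, 0.9]),
]
text_embedding = [0.1, 0.1, 0.8]

best_id, best_vec = retrieve_melody(text_embedding, melody_db)
condition = build_condition(text_embedding, best_vec)
```

A lookup like this only works if text and melody embeddings live in a shared space, which is what the contrastive pretraining stage is meant to provide.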

Code & Models

Repositories

shaopengw/Awesome-Music-Generation
PyTorch · Official

Models

🤗 ManzhenWei/MG2 · model · ♡ 12

Taxonomy

Topics: Diverse Music Education Insights · Music Technology and Sound Studies · Music History and Culture

Methods: Diffusion · ALIGN