Agent-Driven Large Language Models for Mandarin Lyric Generation
Hong-Hsiang Liu, Yi-Wen Liu

TL;DR
This paper introduces a multi-agent system leveraging large language models to improve Mandarin lyric generation by addressing melody-lyric alignment and creativity, validated through listening tests with a singing voice synthesizer.
Contribution
It proposes a novel multi-agent framework for Mandarin lyric generation that decomposes the task into specialized sub-agents, enhancing lyric-melody fit and creative quality.
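As a rough illustration of what decomposing the task into specialized checks could look like, here is a minimal sketch with two rule-based checkers. This is not the authors' implementation: the names `Feedback`, `syllable_agent`, `rhyme_agent`, `review`, and the `TOY_FINALS` lookup are all hypothetical, and the sketch assumes one Chinese character is sung on exactly one note.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Feedback:
    agent: str    # which checker produced this feedback
    ok: bool      # True if the line passed the check
    message: str  # human-readable explanation, e.g. to feed back to a generator


# Toy lookup from a character to its pinyin final (assumption: a real system
# would use a full pronunciation dictionary instead of three entries).
TOY_FINALS: Dict[str, str] = {"心": "in", "音": "in", "天": "ian"}


def hanzi_only(line: str) -> List[str]:
    """Keep only CJK Unified Ideographs, dropping punctuation and spaces."""
    return [ch for ch in line if "\u4e00" <= ch <= "\u9fff"]


def syllable_agent(line: str, n_notes: int) -> Feedback:
    # Assumption: one character corresponds to one sung syllable and one note.
    n = len(hanzi_only(line))
    return Feedback("syllable_count", n == n_notes,
                    f"{n} syllables for {n_notes} notes")


def rhyme_agent(line: str, target_final: str) -> Feedback:
    chars = hanzi_only(line)
    final = TOY_FINALS.get(chars[-1]) if chars else None
    return Feedback("rhyme", final == target_final,
                    f"line ends in final {final!r}, expected {target_final!r}")


def review(line: str, n_notes: int, target_final: str) -> List[Feedback]:
    """Run every checker and return only the failed checks."""
    checks = [syllable_agent(line, n_notes), rhyme_agent(line, target_final)]
    return [fb for fb in checks if not fb.ok]
```

In a loop of this kind, the failed checks returned by `review` would be passed back to the lyric-writing model as revision hints until the list comes back empty.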
Findings
Analysis of the Mpop600 dataset confirms that lyricists and melody writers consider lyric-melody fit when composing.
Listening tests show improved lyric quality with the multi-agent system.
Agent collaboration proves effective for lyric generation.
Abstract
Generative large language models have shown impressive in-context learning abilities, performing well across various tasks with just a prompt. Previous melody-to-lyric research has been limited by scarce high-quality aligned data and unclear standards for creativity. Most efforts have focused on general themes or emotions, which are less valuable given current language model capabilities. In tonal contour languages like Mandarin, pitch contours are influenced by both melody and tone, leading to variations in lyric-melody fit. Our study, validated by the Mpop600 dataset, confirms that lyricists and melody writers consider this fit during their composition process. In this research, we developed a multi-agent system that decomposes the melody-to-lyric task into sub-tasks, with each agent controlling rhyme, syllable count, lyric-melody alignment, and consistency. Listening tests were conducted…
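To make the idea of lyric-melody fit concrete, the following toy heuristic compares the pitch direction between adjacent notes with the direction implied by adjacent lexical tones. It is an illustrative assumption, not the metric used in the paper: the function name `fit_score` is hypothetical, tones are taken in citation form using the standard Chao tone numerals (55, 35, 214, 51), and tone sandhi and the neutral tone are ignored.

```python
from typing import Dict, List, Tuple

# (start, end) pitch levels of the four Mandarin citation tones on the
# five-level Chao scale: T1 = 55, T2 = 35, T3 = 214, T4 = 51.
CHAO: Dict[int, Tuple[int, int]] = {1: (5, 5), 2: (3, 5), 3: (2, 4), 4: (5, 1)}


def sign(x: int) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def fit_score(tones: List[int], midi: List[int]) -> float:
    """Fraction of adjacent syllable pairs whose tone transition does not
    oppose the melodic direction (one tone and one MIDI pitch per syllable)."""
    if len(tones) != len(midi):
        raise ValueError("need exactly one note per syllable")
    if len(tones) < 2:
        return 1.0
    compatible = 0
    for i in range(len(tones) - 1):
        # Direction from the end of this tone to the start of the next one.
        tone_dir = sign(CHAO[tones[i + 1]][0] - CHAO[tones[i]][1])
        melody_dir = sign(midi[i + 1] - midi[i])
        # Only strictly opposite directions count as a clash.
        if tone_dir * melody_dir >= 0:
            compatible += 1
    return compatible / (len(tones) - 1)
```

For example, a fourth tone followed by a first tone implies a rise, so `fit_score([4, 1], [60, 64])` counts the pair as compatible, while the same syllables on a falling melody, `fit_score([4, 1], [64, 60])`, count as a clash.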
Taxonomy
Topics: Topic Modeling
