SongSong: A Time Phonograph for Chinese SongCi Music from Thousand of Years Away
Jiajia Li, Jiliang Hu, Ziyi Pan, Chong Chen, Zuchao Li, Ping Wang, Lefei Zhang

TL;DR
SongSong is a music generation model for restoring ancient Chinese SongCi: it predicts a melody from the input SongCi, generates the singing voice and the accompaniment separately from that melody, and combines them into the final piece. The authors also build OpenSongSong, a 29.9-hour dataset of ancient Chinese SongCi music.
Contribution
We introduce SongSong, to our knowledge the first music generation model capable of restoring Chinese SongCi, along with the OpenSongSong dataset for training and evaluation.
Findings
SongSong outperforms existing platforms in quality and authenticity.
The OpenSongSong dataset contains 29.9 hours of ancient Chinese SongCi music.
Subjective and objective evaluations confirm SongSong's superior performance.
Abstract
Recently, there have been significant advancements in music generation. However, existing models primarily focus on creating modern pop songs, which makes it challenging to produce ancient music with distinct rhythms and styles, such as ancient Chinese SongCi. In this paper, we introduce SongSong, to our knowledge the first music generation model capable of restoring Chinese SongCi. Our model first predicts the melody from the input SongCi, then separately generates the singing voice and the accompaniment based on that melody, and finally combines all elements to create the final piece of music. Additionally, to address the lack of ancient music datasets, we create OpenSongSong, a comprehensive dataset of ancient Chinese SongCi music, featuring 29.9 hours of compositions by various renowned SongCi music masters. To assess SongSong's proficiency in performing SongCi, we randomly select 85 SongCi…
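The three-stage flow described in the abstract (melody prediction, separate voice and accompaniment generation, then mixing) can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, constant, and the sine-wave "synthesis" below are hypothetical placeholders standing in for the learned components, and only the order of the stages is taken from the abstract.

```python
import math
from typing import List

Melody = List[float]    # toy representation: one pitch in Hz per lyric character
Waveform = List[float]  # mono samples in [-1, 1]

SAMPLE_RATE = 8000      # hypothetical value, kept low for a toy example
NOTE_SECONDS = 0.25     # hypothetical fixed note length


def predict_melody(lyrics: str) -> Melody:
    """Stage 1 placeholder: map each non-space character to a pentatonic pitch."""
    scale = [261.63, 293.66, 329.63, 392.00, 440.00]  # C, D, E, G, A
    chars = [c for c in lyrics if not c.isspace()]
    return [scale[ord(c) % len(scale)] for c in chars]


def _render(melody: Melody, gain: float, pitch_ratio: float) -> Waveform:
    """Render a melody as concatenated sine tones (stand-in for a learned generator)."""
    samples_per_note = int(SAMPLE_RATE * NOTE_SECONDS)
    out: Waveform = []
    for freq in melody:
        for i in range(samples_per_note):
            phase = 2.0 * math.pi * freq * pitch_ratio * i / SAMPLE_RATE
            out.append(gain * math.sin(phase))
    return out


def generate_voice(melody: Melody) -> Waveform:
    """Stage 2a placeholder: the 'singing voice' follows the melody directly."""
    return _render(melody, gain=0.6, pitch_ratio=1.0)


def generate_accompaniment(melody: Melody) -> Waveform:
    """Stage 2b placeholder: the 'accompaniment' plays one octave below, more quietly."""
    return _render(melody, gain=0.3, pitch_ratio=0.5)


def mix(voice: Waveform, accompaniment: Waveform) -> Waveform:
    """Stage 3: sum the two tracks sample by sample and clip to [-1, 1]."""
    n = min(len(voice), len(accompaniment))
    return [max(-1.0, min(1.0, voice[i] + accompaniment[i])) for i in range(n)]


def generate_song(lyrics: str) -> Waveform:
    """Run the stages in the order the abstract describes."""
    melody = predict_melody(lyrics)
    voice = generate_voice(melody)
    accompaniment = generate_accompaniment(melody)
    return mix(voice, accompaniment)


if __name__ == "__main__":
    song = generate_song("明月几时有")
    print(len(song), "samples")
```

The point of the sketch is the dependency structure: both the voice and the accompaniment are conditioned on the same predicted melody, and they meet again only at the mixing step.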
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Artificial Intelligence in Games
