A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions
Shulei Ji, Jing Luo, Xinyu Yang

TL;DR
This survey comprehensively reviews deep learning methods for multi-level music generation, covering data, representations, evaluation, challenges, and future prospects in the field.
Contribution
It provides a detailed overview of the evolution, models, datasets, and evaluation techniques across different stages of deep music generation, highlighting gaps and future directions.
Findings
Summarizes current deep learning models for music generation
Analyzes datasets and representations used in the field
Discusses challenges and future research directions
Abstract
The utilization of deep learning techniques in generating various contents (such as image, text, etc.) has become a trend. Especially music, the topic of this paper, has attracted widespread attention of countless researchers.The whole process of producing music can be divided into three stages, corresponding to the three levels of music generation: score generation produces scores, performance generation adds performance characteristics to the scores, and audio generation converts scores with performance characteristics into audio by assigning timbre or generates music in audio format directly. Previous surveys have explored the network models employed in the field of automatic music generation. However, the development history, the model evolution, as well as the pros and cons of same music generation task have not been clearly illustrated. This paper attempts to provide an overview…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMusic and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
