TL;DR
This paper explores direct performance generation in music: producing notes together with their expressive timing and dynamics. It presents an LSTM-based model and evaluates it subjectively, through generated examples and feedback from professional composers and musicians.
Contribution
It introduces a novel approach to music generation that emphasizes expressive performance aspects and discusses data requirements for training such models.
Findings
LSTM-based model produces expressive musical performances
Generated examples are subjectively evaluated as effective
Professional feedback supports the model's musical expressiveness
Abstract
Music generation has generally been focused on either creating scores or interpreting them. We discuss differences between these two problems and propose that, in fact, it may be valuable to work in the space of direct generation: jointly predicting the notes and their expressive timing and dynamics. We consider the significance and qualities of the data set needed for this. Having identified both a problem domain and characteristics of an appropriate data set, we show an LSTM-based recurrent network model that subjectively performs quite well on this task. Critically, we provide generated examples. We also include feedback from professional composers and musicians about some of these examples.
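To make "jointly predicting the notes and their expressive timing and dynamics" concrete, here is a minimal sketch of one way a performance could be flattened into a single event stream that a sequence model such as an LSTM might be trained on. The abstract does not specify the representation, so everything here is an assumption made for illustration: the event names (`NOTE_ON`, `NOTE_OFF`, `TIME_SHIFT`, `VELOCITY`), the 10 ms time step, and the `Note` and `encode_performance` names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Note(NamedTuple):
    pitch: int      # MIDI pitch, 0-127
    start: float    # onset time in seconds
    end: float      # release time in seconds
    velocity: int   # MIDI velocity, 1-127 (loudness of the key strike)


def encode_performance(notes: List[Note], step: float = 0.01) -> List[Tuple[str, int]]:
    """Flatten notes into one stream mixing pitch, timing, and dynamics events.

    Hypothetical encoding, assumed for illustration only.
    """
    # Each boundary is (time, order, kind, pitch, velocity); order 0 sorts
    # note releases before note onsets that fall on the same instant.
    boundaries = []
    for n in notes:
        boundaries.append((n.start, 1, "NOTE_ON", n.pitch, n.velocity))
        boundaries.append((n.end, 0, "NOTE_OFF", n.pitch, 0))
    boundaries.sort()

    events: List[Tuple[str, int]] = []
    now_steps = 0            # current position, in units of `step`
    current_velocity = None  # last velocity emitted, if any
    for time, _, kind, pitch, velocity in boundaries:
        target = round(time / step)
        if target > now_steps:
            # Expressive timing: advance the clock by a quantized amount.
            events.append(("TIME_SHIFT", target - now_steps))
            now_steps = target
        if kind == "NOTE_ON":
            if velocity != current_velocity:
                # Dynamics: emit a velocity change only when it differs.
                events.append(("VELOCITY", velocity))
                current_velocity = velocity
            events.append(("NOTE_ON", pitch))
        else:
            events.append(("NOTE_OFF", pitch))
    return events


example = [Note(60, 0.0, 0.5, 80), Note(64, 0.25, 0.5, 60)]
for event in encode_performance(example):
    print(event)
```

In a stream like this, the timing between notes and the loudness of each note are tokens in the same sequence as the pitches, so a model that predicts the next event predicts all three together rather than generating a score first and interpreting it afterwards.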
