Learning To Generate Piano Music With Sustain Pedals
Joann Ching, Yi-Hsuan Yang

TL;DR
This paper introduces a method to incorporate sustain pedal information into symbolic piano music generation using audio-derived pedal data and transformer models, highlighting the importance of pedal inclusion for realistic piano performance synthesis.
Contribution
To the authors' knowledge, it is among the first works to integrate pedal data inferred from audio into a Transformer-based symbolic music generation model, with the aim of making generated piano performances more realistic.
Findings
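Sustain pedal information can be inferred from audio recordings of piano performances using the transcription model of Kong et al.
Generating pedal-related tokens alongside the other musical tokens shows promise for more realistic generated piano pieces.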
The approach shows potential for enhancing future piano music generation models.
Abstract
Recent years have witnessed a growing interest in research related to the detection of piano pedals from audio signals in the music information retrieval community. However, to the best of our knowledge, recent generative models for symbolic music have rarely taken piano pedals into account. In this work, we employ the transcription model proposed by Kong et al. to get pedal information from the audio recordings of piano performances in the AILabs1k7 dataset, and then modify the Compound Word Transformer proposed by Hsiao et al. to build a Transformer decoder that generates pedal-related tokens along with other musical tokens. Although the work uses inferred sustain pedal information as training data, the results show promise for further improvement and underline the importance of including the sustain pedal in piano performance generation tasks.
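As a rough illustration of the pipeline the abstract describes, the sketch below shows one way inferred pedal intervals could be quantized onto a beat grid and turned into token events. It is not the authors' encoding: the token names `Pedal_On` and `Pedal_Off`, the fixed tempo, the 16th-note grid, and the function `pedal_intervals_to_tokens` are all assumptions made for this example.

```python
from typing import List, Tuple


def pedal_intervals_to_tokens(
    intervals: List[Tuple[float, float]],
    bpm: float = 120.0,        # assumed constant tempo (hypothetical)
    steps_per_beat: int = 4,   # assumed 16th-note grid (hypothetical)
) -> List[Tuple[int, str]]:
    """Quantize (press, release) times in seconds to grid steps.

    Returns (step, token) pairs using the hypothetical token names
    "Pedal_On" and "Pedal_Off".
    """
    step_sec = 60.0 / bpm / steps_per_beat
    events: List[Tuple[int, str]] = []
    for press, release in intervals:
        on_step = round(press / step_sec)
        # Keep every pedal press at least one grid step long.
        off_step = max(round(release / step_sec), on_step + 1)
        events.append((on_step, "Pedal_On"))
        events.append((off_step, "Pedal_Off"))
    # Sort by step; at the same step, emit the release before the press
    # so that re-pedalling reads as "release, then press".
    events.sort(key=lambda e: (e[0], e[1] != "Pedal_Off"))
    return events


if __name__ == "__main__":
    # Two pedal presses, e.g. as a transcription model might report them.
    print(pedal_intervals_to_tokens([(0.0, 1.0), (1.0, 1.9)]))
```

In a real system these events would be merged with the note, tempo, and bar tokens of the chosen representation; the sketch only covers the pedal stream.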
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Diverse Musicological Studies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Label Smoothing · Dense Connections · Residual Connection · Layer Normalization · Absolute Position Encodings · Softmax · Adam
