Towards An Integrated Approach for Expressive Piano Performance Synthesis from Music Scores
Jingjing Tang, Erica Cooper, Xin Wang, Junichi Yamagishi, George Fazekas

TL;DR
This paper introduces an integrated system that converts symbolic music scores into expressive piano performances, combining a Transformer-based rendering model with a neural MIDI synthesiser to produce realistic and expressive audio outputs.
Contribution
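To the authors' knowledge, the paper presents the first streamlined method for transforming score MIDI files without expression control into rich, expressive piano performances, pairing a Transformer-based Expressive Performance Rendering (EPR) model with a fine-tuned neural MIDI synthesiser.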
Findings
Accurately reconstructs human-like expressiveness.
Captures the acoustic ambience of environments such as concert halls and recording studios.
Achieves musical expressiveness with high audio quality.
Abstract
This paper presents an integrated system that transforms symbolic music scores into expressive piano performance audio. By combining a Transformer-based Expressive Performance Rendering (EPR) model with a fine-tuned neural MIDI synthesiser, our approach directly generates expressive audio performances from score inputs. To the best of our knowledge, this is the first system to offer a streamlined method for converting score MIDI files lacking expression control into rich, expressive piano performances. We conducted experiments using subsets of the ATEPP dataset, evaluating the system with both objective metrics and subjective listening tests. Our system not only accurately reconstructs human-like expressiveness, but also captures the acoustic ambience of environments such as concert halls and recording studios. Additionally, the proposed system demonstrates its ability to achieve…
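The two-stage flow described in the abstract (score notes in, expressive note parameters out, then audio) can be sketched as below. This is only an illustrative sketch under stated assumptions, not the authors' implementation: `ScoreNote`, `PerformedNote`, `render_performance`, and `synthesise` are hypothetical names, the rendering stage is a hand-written phrasing rule standing in for the Transformer-based EPR model, and the synthesis stage is a decaying sine tone standing in for the fine-tuned neural MIDI synthesiser.

```python
import math
from dataclasses import dataclass


@dataclass
class ScoreNote:
    """A note as written in the score: no dynamics, timing in beats."""
    pitch: int             # MIDI pitch number (69 = A4)
    onset_beats: float     # nominal onset position in beats
    duration_beats: float  # nominal duration in beats


@dataclass
class PerformedNote:
    """A note with performance parameters: absolute timing and velocity."""
    pitch: int
    onset_sec: float
    duration_sec: float
    velocity: int          # MIDI velocity, 1-127


def render_performance(score, bpm=100.0):
    """Hypothetical stand-in for the EPR stage (a toy rule, not a Transformer).

    Converts beats to seconds at a fixed tempo, shortens each note slightly,
    and shapes velocity as an arc that peaks mid-phrase.
    """
    sec_per_beat = 60.0 / bpm
    performed = []
    for i, note in enumerate(score):
        position = i / max(len(score) - 1, 1)  # 0.0 at phrase start, 1.0 at end
        velocity = int(round(50 + 40 * math.sin(math.pi * position)))
        performed.append(PerformedNote(
            pitch=note.pitch,
            onset_sec=note.onset_beats * sec_per_beat,
            duration_sec=note.duration_beats * sec_per_beat * 0.9,
            velocity=velocity,
        ))
    return performed


def synthesise(performed, sample_rate=8000):
    """Hypothetical stand-in for the synthesis stage (not a neural model).

    Renders each note as an exponentially decaying sine tone and sums the
    tones into one mono buffer of float samples.
    """
    total_sec = max(n.onset_sec + n.duration_sec for n in performed)
    audio = [0.0] * (int(total_sec * sample_rate) + 1)
    for note in performed:
        freq = 440.0 * 2 ** ((note.pitch - 69) / 12)  # equal temperament
        start = int(note.onset_sec * sample_rate)
        amplitude = note.velocity / 127
        for k in range(int(note.duration_sec * sample_rate)):
            t = k / sample_rate
            audio[start + k] += amplitude * math.exp(-3.0 * t) * math.sin(2 * math.pi * freq * t)
    return audio
```

Calling `synthesise(render_performance(score))` on a list of `ScoreNote` objects returns a list of float samples. In the system the paper describes, both stages are learned models, so the split into two functions here mirrors only the overall data flow.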
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing · Neuroscience and Music Perception
