DExter: Learning and Controlling Performance Expression with Diffusion Models
Huan Zhang, Shreyan Chowdhury, Carlos Eduardo Cancino-Chacón, Jinhua Liang, Simon Dixon, Gerhard Widmer

TL;DR
DExter is a diffusion-based model that learns and controls expressive piano performances, enabling style transfer and perceptually guided variations, with promising quantitative and qualitative results.
Contribution
The paper introduces DExter, a novel diffusion model for expressive music performance rendering that incorporates perceptual features and enables style transfer.
Findings
Captures time-varying expressive parameters effectively
Performs well in subjective performance quality evaluations
Enables perceptually guided performance generation and style transfer
Abstract
In the pursuit of developing expressive music performance models using artificial intelligence, this paper introduces DExter, a new approach leveraging diffusion probabilistic models to render Western classical piano performances. In this approach, performance parameters are represented in a continuous expression space and a diffusion model is trained to predict these continuous parameters while being conditioned on the musical score. Furthermore, DExter also enables the generation of interpretations (expressive variations of a performance) guided by perceptually meaningful features by conditioning jointly on score and perceptual feature representations. Consequently, we find that our model is useful for learning expressive performance, generating perceptually steered performances, and transferring performance styles. We assess the model through quantitative and qualitative analyses,…
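The training idea described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration of a conditional diffusion training step, not the authors' implementation. Everything named here is an assumption for illustration: the linear noise schedule, the array sizes, the `toy_denoiser` linear map standing in for the real network, and the choice to regress the clean parameters directly (the summary only says the model is "trained to predict these continuous parameters").

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and cumulative alpha products (a common DDPM default, assumed here)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def q_sample(x0, t, alpha_bars, noise):
    """Forward process: corrupt clean expression parameters x0 up to diffusion step t."""
    a = alpha_bars[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

def toy_denoiser(x_t, t, score_cond, perceptual_cond, weights, num_steps):
    """Hypothetical stand-in for the network: one linear map over the concatenated inputs."""
    n_notes = x_t.shape[0]
    t_feat = np.full((n_notes, 1), t / num_steps)        # normalized timestep per note
    p_feat = np.tile(perceptual_cond, (n_notes, 1))      # broadcast piece-level features to every note
    inputs = np.concatenate([x_t, score_cond, p_feat, t_feat], axis=1)
    return inputs @ weights

def training_loss(x0, score_cond, perceptual_cond, weights, alpha_bars, rng):
    """One training step: noise x0 at a random step, predict x0 back, return the MSE."""
    t = int(rng.integers(0, len(alpha_bars)))
    noise = rng.standard_normal(x0.shape)
    x_t = q_sample(x0, t, alpha_bars, noise)
    pred = toy_denoiser(x_t, t, score_cond, perceptual_cond, weights, len(alpha_bars))
    return float(np.mean((pred - x0) ** 2))

# Arbitrary illustrative sizes: notes, expression parameters, score features, perceptual features.
rng = np.random.default_rng(0)
n_notes, n_params, n_score, n_percep = 16, 4, 6, 7
x0 = rng.standard_normal((n_notes, n_params))            # continuous performance parameters
score_cond = rng.standard_normal((n_notes, n_score))     # per-note score conditioning
perceptual_cond = rng.standard_normal(n_percep)          # piece-level perceptual conditioning
weights = 0.1 * rng.standard_normal((n_params + n_score + n_percep + 1, n_params))

betas, alpha_bars = make_schedule()
loss = training_loss(x0, score_cond, perceptual_cond, weights, alpha_bars, rng)
```

In this sketch, steering a performance would amount to changing `perceptual_cond` at sampling time while keeping `score_cond` fixed. How the actual model injects its conditioning is not specified in this summary.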
Taxonomy
Topics: Online Learning and Analytics
Methods: Diffusion
