PianoFlow: Music-Aware Streaming Piano Motion Generation with Bimanual Coordination
Xuan Wang, Kai Ruan, Jiayi Han, Kaiyue Zhou, Gaoang Wang

TL;DR
PianoFlow is a flow-matching framework that generates precise, music-aware bimanual piano motion in real time, using MIDI priors during training and role-aware attention to coordinate the two hands.
Contribution
PianoFlow combines a flow-matching approach that distills MIDI priors during training, an asymmetric role-gated interaction module for cross-hand coordination, and an autoregressive scheme for real-time streaming piano motion synthesis.
Findings
Outperforms previous methods in both quantitative and qualitative evaluation.
Accelerates inference by over 9 times compared to previous methods.
Enables real-time streaming for arbitrarily long sequences.
Abstract
Audio-driven bimanual piano motion generation requires precise modeling of complex musical structures and dynamic cross-hand coordination. However, existing methods often rely on acoustic-only representations lacking symbolic priors, employ inflexible interaction mechanisms, and are limited to computationally expensive short-sequence generation. To address these limitations, we propose PianoFlow, a flow-matching framework for precise and coordinated bimanual piano motion synthesis. Our approach strategically leverages MIDI as a privileged modality during training, distilling these structured musical priors to achieve deep semantic understanding while maintaining audio-only inference. Furthermore, we introduce an asymmetric role-gated interaction module to explicitly capture dynamic cross-hand coordination through role-aware attention and temporal gating. To enable real-time streaming…
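For readers unfamiliar with flow matching, the following is a minimal sketch of the generic technique, not PianoFlow's implementation. It assumes the common straight-line path between a noise sample and a data sample, omits audio conditioning, MIDI distillation, and the role-gated interaction module entirely, and replaces the trained network with a closed-form "oracle" velocity for a single toy pose. All names (`interpolate`, `euler_sample`, `TOY_POSE`, and so on) are hypothetical.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 to data x1 at time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Regression target for the velocity model along the straight path."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity for one sample."""
    x_t = interpolate(x0, x1, t)
    v_pred = predict_velocity(x_t, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)

def euler_sample(predict_velocity, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1 inside the loop
        v = predict_velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Assumption: a single toy "pose" stands in for the motion data distribution.
TOY_POSE = [0.2, -0.5, 1.0]

def oracle_velocity(x, t):
    """Closed-form velocity toward TOY_POSE; a stand-in for a trained network."""
    return [(p - xi) / (1.0 - t) for p, xi in zip(TOY_POSE, x)]

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in TOY_POSE]

# Sampling: a few Euler steps carry the noise sample to the toy pose.
sample = euler_sample(oracle_velocity, noise, num_steps=8)

# Training signal: the oracle already matches the target, so the loss is ~0.
loss = flow_matching_loss(oracle_velocity, noise, TOY_POSE, t=0.3)
```

In a real system the oracle would be a learned network conditioned on audio features, and the loss would be averaged over random noise samples, data samples, and times. Because the learned path is close to straight, sampling can often use few integration steps, which is one general reason flow-based generators tend to be fast at inference.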
