PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance
Qijun Gan, Song Wang, Shengtao Wu, Jianke Zhu

TL;DR
This paper introduces PianoMotion10M, a large annotated dataset of piano hand movements and a benchmark for generating realistic hand motions from audio, aiming to improve piano instruction systems.
Contribution
It provides a comprehensive dataset and evaluation framework for hand motion generation in piano performance, addressing a gap in music instruction AI tools.
Findings
Collected 116 hours of annotated piano videos with 10 million hand poses.
Developed a baseline model that generates hand motions from audio.
Designed evaluation metrics for motion quality and accuracy.
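As a rough sanity check on the dataset scale quoted above, the two headline figures can be turned into an average annotation density. This is only a back-of-the-envelope sketch: "10 million" is a rounded figure, and the summary does not say whether a pose is counted per frame or per hand, so the result is indicative only. The function name `poses_per_second` is a hypothetical helper, not part of any released code.

```python
# Figures quoted in the summary above (rounded as stated there).
HOURS_OF_VIDEO = 116
ANNOTATED_POSES = 10_000_000


def poses_per_second(total_poses: int, hours: float) -> float:
    """Average number of annotated poses per second of footage."""
    return total_poses / (hours * 3600)


rate = poses_per_second(ANNOTATED_POSES, HOURS_OF_VIDEO)
print(f"{rate:.1f} poses per second of video")  # prints "23.9 poses per second of video"
```

Under these assumptions the density comes out close to a typical video frame rate, which is consistent with dense, frame-level annotation rather than sparse keyframes.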
Abstract
Recently, artificial intelligence techniques for education have received increasing attention, yet designing effective musical instrument instruction systems remains an open problem. Although key presses can be directly derived from sheet music, the transitional movements between key presses require more extensive guidance in piano performance. In this work, we construct a piano-hand motion generation benchmark to guide hand movements and fingerings for piano playing. To this end, we collect an annotated dataset, PianoMotion10M, consisting of 116 hours of piano playing videos from a bird's-eye view with 10 million annotated hand poses. We also introduce a powerful baseline model that generates hand motions from piano audio through a position predictor and a position-guided gesture generator. Furthermore, a series of evaluation metrics is designed to assess the…
Taxonomy
Topics: Music Technology and Sound Studies · Human Motion and Animation · Music and Audio Processing
