End-to-End Real-World Polyphonic Piano Audio-to-Score Transcription with Hierarchical Decoding
Wei Zeng, Xian He, Ye Wang

TL;DR
This paper introduces an end-to-end hierarchical Seq2Seq model for real-world polyphonic piano audio-to-score transcription, effectively capturing score details at multiple levels and bridging synthetic and real recordings through a two-stage training scheme.
Contribution
It proposes a novel hierarchical decoding approach with multi-task learning and a two-stage training process to improve real-world piano transcription accuracy.
Findings
Outperforms the current state of the art on synthetic data
Successfully transcribes human recordings with improved accuracy
Effectively captures score information at both bar and note levels
Abstract
Piano audio-to-score transcription (A2S) is an important yet underexplored task with extensive applications for music composition, practice, and analysis. However, existing end-to-end piano A2S systems face difficulties in retrieving bar-level information such as key and time signatures, and have been trained and evaluated with only synthetic data. To address these limitations, we propose a sequence-to-sequence (Seq2Seq) model with a hierarchical decoder that aligns with the hierarchical structure of musical scores, enabling the transcription of score information at both the bar and note levels through multi-task learning. To bridge the gap between synthetic data and recordings of human performance, we propose a two-stage training scheme, which involves pre-training the model using an expressive performance rendering (EPR) system on synthetic audio, followed by fine-tuning the model using…
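As a rough illustration of how bar-level and note-level objectives might be combined under multi-task learning, the following minimal sketch sums a key-signature loss, a time-signature loss, and an averaged note-token loss for a single bar. The function names, the dictionary layout, and the equal task weights are assumptions made for illustration; the abstract does not specify the loss formulation, so this should not be read as the authors' implementation.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target])


def multitask_bar_loss(bar_pred, bar_target, weights=None):
    """Weighted sum of bar-level and note-level losses for one bar.

    bar_pred:   dict with 'key' and 'time_sig' probability vectors and
                'notes', a list of per-step probability vectors.
    bar_target: dict with the matching integer class indices.
    weights:    hypothetical task weights (an assumption, not from the paper).
    """
    weights = weights or {"key": 1.0, "time_sig": 1.0, "notes": 1.0}

    # Bar-level tasks: one prediction per bar.
    key_loss = cross_entropy(bar_pred["key"], bar_target["key"])
    ts_loss = cross_entropy(bar_pred["time_sig"], bar_target["time_sig"])

    # Note-level task: average the loss over the note tokens in the bar.
    note_losses = [
        cross_entropy(p, t)
        for p, t in zip(bar_pred["notes"], bar_target["notes"])
    ]
    note_loss = sum(note_losses) / len(note_losses) if note_losses else 0.0

    return (
        weights["key"] * key_loss
        + weights["time_sig"] * ts_loss
        + weights["notes"] * note_loss
    )


# Toy example with two classes per task and two note tokens in the bar.
pred = {
    "key": [0.5, 0.5],
    "time_sig": [0.25, 0.75],
    "notes": [[0.5, 0.5], [1.0, 0.0]],
}
target = {"key": 0, "time_sig": 1, "notes": [1, 0]}
loss = multitask_bar_loss(pred, target)
```

In a real system the probability vectors would come from the bar-level and note-level decoder outputs, and the task weights would be tuned rather than fixed at 1.0.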
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Diverse Musicological Studies
