Non-Local Musical Statistics as Guides for Audio-to-Score Piano Transcription

Kentaro Shibata; Eita Nakamura; Kazuyoshi Yoshii

arXiv:2008.12710·cs.SD·April 6, 2021


TL;DR

This paper presents a piano transcription system that combines deep-neural-network-based multipitch detection with statistical-model-based rhythm quantization, and uses non-local statistics of pitch and rhythm to improve the estimation of global characteristics such as tempo scale, metre, and bar line positions.

Contribution

It proposes a novel integration of deep learning and non-local statistical features to enhance global musical structure inference in automatic piano transcription.

Findings

01

Achieved an overall transcription error rate of 7.1% on a dataset of popular piano music.

02

Attained an 85.6% downbeat F-measure, demonstrating effective rhythm estimation.

03

Non-local statistics significantly improved global characteristic estimation.
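
To make the downbeat F-measure in the findings concrete, here is a minimal sketch of how such a score can be computed from estimated and reference downbeat times. The function name `downbeat_f_measure`, the one-to-one matching rule, and the 70 ms tolerance are assumptions for illustration; this summary does not state the paper's exact matching criterion.

```python
def downbeat_f_measure(estimated, reference, tolerance=0.07):
    """Return (precision, recall, F-measure) for downbeat times in seconds.

    Assumption: an estimated downbeat counts as correct if it lies within
    `tolerance` seconds of a reference downbeat, and each reference
    downbeat can be matched at most once.
    """
    est = sorted(estimated)
    ref = sorted(reference)
    matched = 0
    i = j = 0
    # Walk both sorted lists in parallel, pairing events that are close enough.
    while i < len(est) and j < len(ref):
        diff = est[i] - ref[j]
        if abs(diff) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # estimate is too early to match this or any later reference
        else:
            j += 1  # reference is too early to match this or any later estimate
    precision = matched / len(est) if est else 0.0
    recall = matched / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example: three of four estimated downbeats fall near a reference.
p, r, f = downbeat_f_measure([0.0, 2.05, 4.0, 7.0], [0.0, 2.0, 4.0, 6.0])
```

The F-measure is the harmonic mean of precision and recall, so it penalises both spurious and missed bar lines.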

Abstract

We present an automatic piano transcription system that converts polyphonic audio recordings into musical scores. This has been a long-standing problem of music information processing, and recent studies have made remarkable progress in the two main component techniques: multipitch detection and rhythm quantization. Given this situation, we study a method integrating deep-neural-network-based multipitch detection and statistical-model-based rhythm quantization. In the first part, we conducted systematic evaluations and found that while the present method achieved high transcription accuracies at the note level, some global characteristics of music, such as tempo scale, metre (time signature), and bar line positions, were often incorrectly estimated. In the second part, we formulated non-local statistics of pitch and rhythmic contents that are derived from musical knowledge and studied…
