An Empirical Evaluation of End-to-End Polyphonic Optical Music Recognition

Sachinda Edirisooriya; Hao-Wen Dong; Julian McAuley; Taylor Berg-Kirkpatrick

arXiv:2108.01769 · cs.CV · August 5, 2021 · 6 citations

TL;DR

This paper introduces new datasets and models for end-to-end polyphonic optical music recognition, achieving state-of-the-art results by treating the task as multi-sequence detection with novel decoder architectures.

Contribution

It presents two innovative formulations for polyphonic OMR and introduces the RNNDecoder, improving recognition accuracy on complex polyphonic scores.

Findings

01. RNNDecoder achieves state-of-the-art performance.

02. New datasets enable large-scale polyphonic OMR evaluation.

03. Multi-sequence detection outperforms previous methods.

Abstract

Previous work has shown that neural architectures are able to perform optical music recognition (OMR) on monophonic and homophonic music with high accuracy. However, piano and orchestral scores frequently exhibit polyphonic passages, which add a second dimension to the task. Monophonic and homophonic music can be described as homorhythmic, or having a single musical rhythm. Polyphonic music, on the other hand, can be seen as having multiple rhythmic sequences, or voices, concurrently. We first introduce a workflow for creating large-scale polyphonic datasets suitable for end-to-end recognition from sheet music publicly available on the MuseScore forum. We then propose two novel formulations for end-to-end polyphonic OMR -- one treating the problem as a type of multi-task binary classification, and the other treating it as multi-sequence detection. Building upon the encoder-decoder…
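
As a rough illustration of the "multi-task binary classification" framing mentioned in the abstract, the sketch below encodes each time step of a score as one yes/no label per vocabulary symbol, so that concurrent voices simply turn on several labels at once. This is one possible reading of the phrase, not the paper's implementation; the function name `multi_hot_targets`, the symbol names, and the toy vocabulary are all hypothetical.

```python
from typing import List, Sequence


def multi_hot_targets(
    frames: Sequence[Sequence[str]], vocab: Sequence[str]
) -> List[List[int]]:
    """Return one binary label per vocabulary symbol for every time step.

    Hypothetical helper: each label answers "is this symbol present at
    this time step?", which is one way to read multi-task binary
    classification.
    """
    index = {symbol: i for i, symbol in enumerate(vocab)}
    targets = []
    for symbols in frames:
        row = [0] * len(vocab)
        for symbol in symbols:
            row[index[symbol]] = 1  # mark the symbol as present
        targets.append(row)
    return targets


# Toy vocabulary and two time steps (assumed names, not the paper's encoding).
# The second time step contains two concurrent voices.
vocab = ["note-C4_quarter", "note-E4_half", "rest-quarter", "barline"]
frames = [["note-C4_quarter"], ["note-E4_half", "rest-quarter"]]
targets = multi_hot_targets(frames, vocab)
print(targets)
```

A monophonic time step yields a row with a single 1, while a polyphonic one yields several, which is why a per-symbol binary decision can represent simultaneous voices where a single softmax over symbols cannot.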

Code & Models

Repositories

sachindae/polyphonic-omr (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies