Unaligned Supervision For Automatic Music Transcription in The Wild

Ben Maman; Amit H. Bermano

arXiv:2204.13668 · cs.SD · April 29, 2022 · 5 cites

TL;DR

This paper introduces NoteEM, a fully automated method for training music transcription models using unaligned, in-the-wild recordings, achieving state-of-the-art accuracy across diverse instruments without manual score alignment.

Contribution

NoteEM enables training on unaligned, real-world recordings with minimal human intervention, improving multi-instrument automatic music transcription accuracy and robustness.

Findings

01  Achieved state-of-the-art note-level accuracy on the MAPS dataset.

02  Demonstrated strong cross-dataset generalization.

03  Showed robustness with small, self-collected datasets.

Abstract

Multi-instrument Automatic Music Transcription (AMT), or the decoding of a musical recording into semantic musical content, is one of the holy grails of Music Information Retrieval. Current AMT approaches are restricted to piano and (some) guitar recordings, due to difficult data collection. In order to overcome data collection barriers, previous AMT approaches attempt to employ musical scores in the form of a digitized version of the same song or piece. The scores are typically aligned using audio features and strenuous human intervention to generate training labels. We introduce NoteEM, a method for simultaneously training a transcriber and aligning the scores to their corresponding performances, in a fully-automated process. Using this unaligned supervision scheme, complemented by pseudo-labels and pitch-shift augmentation, our method enables training on in-the-wild recordings with…
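The alternation the abstract describes (a transcriber's predictions are used to align the score, and the aligned score then serves as training labels) can be illustrated with a small alignment sketch. The code below is a minimal illustration, not the authors' implementation: it assumes dynamic time warping (DTW) as the alignment technique and a frame-wise binary cross-entropy cost, neither of which is stated in the abstract, and the names `frame_cost`, `dtw_path`, and `align_labels` are hypothetical.

```python
import numpy as np


def frame_cost(probs, score, eps=1e-6):
    """Binary cross-entropy between every audio frame and every score frame.

    probs: (T_audio, P) predicted pitch probabilities in [0, 1].
    score: (T_score, P) binary piano roll taken from the unaligned score.
    Returns a (T_audio, T_score) cost matrix.
    """
    p = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    s = np.asarray(score, dtype=float)
    return -(np.log(p) @ s.T + np.log(1.0 - p) @ (1.0 - s).T)


def dtw_path(cost):
    """Plain O(n*m) DTW; returns the optimal path as (audio, score) index pairs."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev

    # Backtrack from the last cell to the first one.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),  # diagonal step
            (acc[i - 1, j], i - 1, j),          # advance audio only
            (acc[i, j - 1], i, j - 1),          # advance score only
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return path[::-1]


def align_labels(probs, score):
    """Warp the score onto the audio time axis to obtain frame-level labels.

    Assumption: when several score frames map to one audio frame,
    their active pitches are merged with an element-wise maximum.
    """
    s = np.asarray(score, dtype=float)
    labels = np.zeros((len(probs), s.shape[1]))
    for audio_idx, score_idx in dtw_path(frame_cost(probs, s)):
        labels[audio_idx] = np.maximum(labels[audio_idx], s[score_idx])
    return labels
```

In a training loop built on this idea, `align_labels` would be re-run with the current model's predictions after each round of training, so that the labels and the transcriber improve together. How the paper combines this with pseudo-labels and pitch-shift augmentation is not covered by this sketch.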

Code & Models

Repositories

benadar293/benadar293.github.io (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Diverse Musicological Studies