The Effect of Spectrogram Reconstruction on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy

Kin Wai Cheuk; Yin-Jyun Luo; Emmanouil Benetos; Dorien Herremans

arXiv:2010.09969 · cs.SD · October 21, 2020

TL;DR

This paper investigates how spectrogram reconstruction loss influences automatic music transcription models, demonstrating that it can improve note-level accuracy and frame-level precision without relying on supervised onset/offset sub-tasks.

Contribution

The study introduces a dual U-net architecture trained with a spectrogram reconstruction loss and shows that it improves transcription accuracy relative to models trained without reconstruction.

Findings

01

Reconstruction loss improves note-level transcription accuracy.

02

Reconstruction loss boosts frame-level precision beyond state-of-the-art.

03

Feature maps exhibit gridlike structures, suggesting that the model counts along the time and frequency axes.
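
For readers unfamiliar with the frame-level precision mentioned in the findings above, the following is a minimal sketch of how precision can be computed over the (frame, pitch) cells of a piano roll. The function name `frame_precision`, the 0.5 threshold, and the toy arrays are assumptions made for illustration; the paper's own evaluation code may differ.

```python
import numpy as np

def frame_precision(pred_roll, true_roll, threshold=0.5):
    """Precision over (frame, pitch) cells: TP / (TP + FP).

    The 0.5 threshold is an illustrative assumption, not a value from the paper.
    """
    pred = np.asarray(pred_roll) >= threshold   # binarize the posteriorgram
    true = np.asarray(true_roll).astype(bool)   # ground-truth piano roll
    tp = np.logical_and(pred, true).sum()       # active cells predicted correctly
    fp = np.logical_and(pred, ~true).sum()      # active predictions with no true note
    return float(tp / (tp + fp)) if (tp + fp) > 0 else 0.0

# Toy example: 2 frames x 3 pitches (made-up values)
posteriorgram = np.array([[0.9, 0.2, 0.7],
                          [0.8, 0.6, 0.1]])
labels = np.array([[1, 0, 0],
                   [1, 1, 0]])
precision = frame_precision(posteriorgram, labels)
print(precision)  # 3 true positives, 1 false positive -> 0.75
```

Precision only penalizes false positives, so a high value here says the predicted active frames are mostly correct, not that every true frame was found.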

Abstract

Most of the state-of-the-art automatic music transcription (AMT) models break down the main transcription task into sub-tasks such as onset prediction and offset prediction and train them with onset and offset labels. These predictions are then concatenated together and used as the input to train another model with the pitch labels to obtain the final transcription. We attempt to use only the pitch labels (together with spectrogram reconstruction loss) and explore how far this model can go without introducing supervised sub-tasks. In this paper, we do not aim at achieving state-of-the-art transcription accuracy, instead, we explore the effect that spectrogram reconstruction has on our AMT model. Our proposed model consists of two U-nets: the first U-net transcribes the spectrogram into a posteriorgram, and a second U-net transforms the posteriorgram back into a spectrogram. A…
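
The two-stage pipeline described in the abstract (spectrogram to posteriorgram, then posteriorgram back to spectrogram) can be sketched as a data flow with a combined objective. This is only an illustration under stated assumptions: the two U-nets are replaced by hypothetical linear maps, binary cross-entropy and mean squared error are assumed loss choices, the two terms are weighted equally, and all sizes and data are made up. None of these details are taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_bins, n_pitches = 8, 16, 4  # toy sizes, not the paper's

# Hypothetical stand-ins for the two U-nets: plain linear maps.
W_transcribe = rng.normal(scale=0.1, size=(n_bins, n_pitches))
W_reconstruct = rng.normal(scale=0.1, size=(n_pitches, n_bins))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def transcriber(spec):
    """Stand-in for the first U-net: spectrogram -> posteriorgram in (0, 1)."""
    return sigmoid(spec @ W_transcribe)

def reconstructor(post):
    """Stand-in for the second U-net: posteriorgram -> spectrogram."""
    return post @ W_reconstruct

def binary_cross_entropy(pred, target, eps=1e-7):
    pred = np.clip(pred, eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def mean_squared_error(a, b):
    return float(np.mean((a - b) ** 2))

# Made-up input spectrogram and pitch labels (the only supervision used).
spectrogram = rng.random((n_frames, n_bins))
pitch_labels = (rng.random((n_frames, n_pitches)) > 0.8).astype(float)

posteriorgram = transcriber(spectrogram)
reconstruction = reconstructor(posteriorgram)

# Assumed objective: transcription loss on pitch labels plus reconstruction loss.
transcription_loss = binary_cross_entropy(posteriorgram, pitch_labels)
reconstruction_loss = mean_squared_error(reconstruction, spectrogram)
total_loss = transcription_loss + reconstruction_loss
print(posteriorgram.shape, reconstruction.shape, total_loss)
```

The point of the sketch is that no onset or offset labels appear anywhere: the only targets are the pitch labels and the input spectrogram itself.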

Taxonomy

Methods: Convolution · Concatenated Skip Connection · Max Pooling · U-Net