Adversarial Learning for Improved Onsets and Frames Music Transcription

Jong Wook Kim; Juan Pablo Bello

arXiv:1906.08512·cs.SD·June 21, 2019·21 cites

TL;DR

This paper introduces an adversarial training scheme for music transcription that models inter-label dependencies, yielding consistent gains in frame-level and note-level metrics over the Onsets and Frames model.

Contribution

It proposes a novel adversarial learning approach applied directly to time-frequency representations, enhancing transcription performance over traditional supervised models.

Findings

01 Significant reduction in error rates.

02 Improved frame-level and note-level metrics.

03 Enhanced confidence in model estimations.

Abstract

Automatic music transcription is considered to be one of the hardest problems in music information retrieval, yet recent deep learning approaches have achieved substantial improvements on transcription performance. These approaches commonly employ supervised learning models that predict various time-frequency representations, by minimizing element-wise losses such as the cross entropy function. However, applying the loss in this manner assumes conditional independence of each label given the input, and thus cannot accurately express inter-label dependencies. To address this issue, we introduce an adversarial training scheme that operates directly on the time-frequency representations and makes the output distribution closer to the ground-truth. Through adversarial learning, we achieve a consistent improvement in both frame-level and note-level metrics over Onsets and Frames, a…
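
The conditional-independence point in the abstract can be checked numerically. The sketch below is not taken from the paper. It uses a made-up toy piano roll (4 frames by 8 pitches) and two hypothetical helper functions. It shows that a summed element-wise binary cross entropy equals the negative log-likelihood of a model that treats every label as an independent Bernoulli variable given the input. This is why such a loss cannot express dependencies between labels.

```python
import numpy as np

def elementwise_bce(probs, labels, eps=1e-12):
    """Binary cross entropy summed over every (frame, pitch) cell."""
    probs = np.clip(probs, eps, 1.0 - eps)
    return -np.sum(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))

def factorized_nll(probs, labels, eps=1e-12):
    """Negative log-likelihood when the joint is a product of independent Bernoullis."""
    probs = np.clip(probs, eps, 1.0 - eps)
    per_label = np.where(labels == 1, probs, 1.0 - probs)  # p(y_i | x) for each cell
    return -np.log(np.prod(per_label))

# Toy data (assumption): random predictions and random binary targets.
rng = np.random.default_rng(0)
probs = rng.uniform(0.05, 0.95, size=(4, 8))
labels = (rng.uniform(size=(4, 8)) < 0.3).astype(float)

bce = elementwise_bce(probs, labels)
nll = factorized_nll(probs, labels)
print(np.isclose(bce, nll))
```

Because the two quantities coincide, minimizing the element-wise loss fits only the per-label marginals. An adversarial term, as the abstract describes, is one way to push the joint output distribution toward the ground truth.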

Taxonomy

Topics: Music and Audio Processing · Generative Adversarial Networks and Image Synthesis · Diverse Musicological Studies