A Hybrid Recurrent Neural Network For Music Transcription

Siddharth Sigtia; Emmanouil Benetos; Nicolas Boulanger-Lewandowski; Tillman Weyde; Artur S. d'Avila Garcez; Simon Dixon

arXiv:1411.1623 · cs.LG · November 7, 2014

TL;DR

This paper introduces a hybrid model that combines RNN-based music language models with a frame-level acoustic classifier for automatic music transcription, and reports that it consistently outperforms existing transcription methods on piano music from the MAPS dataset.

Contribution

The paper presents a novel generative architecture combining RNN-based music language models with acoustic classifiers for improved transcription accuracy.

Findings

01. The proposed model consistently outperforms existing transcription methods on piano music from the MAPS dataset.

02. Incorporating higher-level, score-like symbolic information improves transcription performance.

03. The paper also compares different neural network architectures for acoustic modeling.

Abstract

We investigate the problem of incorporating higher-level symbolic score-like information into Automatic Music Transcription (AMT) systems to improve their performance. We use recurrent neural networks (RNNs) and their variants as music language models (MLMs) and present a generative architecture for combining these models with predictions from a frame-level acoustic classifier. We also compare different neural network architectures for acoustic modeling. The proposed model computes a distribution over possible output sequences given the acoustic input signal, and we present an algorithm for performing a global search for good candidate transcriptions. The performance of the proposed model is evaluated on piano music from the MAPS dataset, and we observe that the proposed model consistently outperforms existing transcription methods.
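
To make the idea in the abstract concrete, here is a minimal sketch of how frame-level acoustic predictions and a language model might be combined during a search over note sequences. Everything in it is an assumption for illustration: the abstract does not give the scoring formula or name the search algorithm, so this sketch simply adds the two log-probabilities per frame and uses a generic beam search. The function `toy_language_model` is a hypothetical first-order stand-in for an RNN music language model, and the acoustic probabilities are made-up numbers, not results from the paper.

```python
import math
from itertools import product


def bernoulli_log_prob(frame, probs):
    """Log-probability of a binary note vector under independent per-pitch probabilities."""
    eps = 1e-9  # guard against log(0)
    return sum(
        math.log(max(p, eps)) if on else math.log(max(1.0 - p, eps))
        for on, p in zip(frame, probs)
    )


def toy_language_model(history, n_pitches, stay=0.8, onset=0.1):
    """Hypothetical stand-in for an RNN music language model.

    Returns per-pitch "note on" probabilities for the next frame.
    Only the previous frame is used here; a real RNN would condition
    on the whole history.
    """
    if not history:
        return [onset] * n_pitches
    return [stay if on else onset for on in history[-1]]


def beam_search(acoustic_probs, beam_width=4, lm=toy_language_model):
    """Keep the beam_width best partial transcriptions at every frame.

    Assumed score: sum over frames of
        log P_lm(y_t | y_<t) + log P_acoustic(y_t | x_t)
    """
    n_pitches = len(acoustic_probs[0])
    # Exhaustive candidate set: only feasible for toy pitch counts.
    candidates = list(product((0, 1), repeat=n_pitches))
    beam = [(0.0, [])]  # (log score, list of frames so far)
    for frame_probs in acoustic_probs:
        expanded = []
        for score, history in beam:
            lm_probs = lm(history, n_pitches)
            for cand in candidates:
                new_score = (
                    score
                    + bernoulli_log_prob(cand, lm_probs)
                    + bernoulli_log_prob(cand, frame_probs)
                )
                expanded.append((new_score, history + [cand]))
        expanded.sort(key=lambda item: item[0], reverse=True)
        beam = expanded[:beam_width]
    return beam


# Made-up acoustic posteriors: 3 frames x 3 pitches.
acoustic = [
    [0.95, 0.10, 0.20],
    [0.55, 0.10, 0.10],
    [0.45, 0.20, 0.95],
]
best_score, best_sequence = beam_search(acoustic)[0]
print(best_sequence)
```

In this toy input the acoustic probability for the first pitch drops to 0.45 in the last frame, so thresholding the classifier at 0.5 would switch the note off. The top hypothesis keeps it on because the stand-in language model favors notes that continue, which is the kind of sequence-level correction the hybrid approach aims for.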

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing