Learning Features of Music from Scratch

John Thickstun; Zaid Harchaoui; Sham Kakade

arXiv:1611.09827·stat.ML·April 7, 2017·115 cites

TL;DR

This paper presents MusicNet, a large-scale dataset of annotated classical music recordings, and benchmarks three machine learning architectures for note prediction, showing that end-to-end models learn frequency-selective audio features.

Contribution

Introduction of MusicNet, a comprehensive annotated music dataset, and evaluation of multiple machine learning architectures for music note prediction.

Findings

01 End-to-end models learn frequency-selective filters.

02 Spectrogram-based models serve as baselines.

03 End-to-end neural nets outperform traditional methods.

Abstract

This paper introduces a new large-scale music dataset, MusicNet, to serve as a source of supervision and evaluation of machine learning methods for music research. MusicNet consists of hundreds of freely-licensed classical music recordings by 10 composers, written for 11 instruments, together with instrument/note annotations resulting in over 1 million temporal labels on 34 hours of chamber music performances under various studio and microphone conditions. The paper defines a multi-label classification task to predict notes in musical recordings, along with an evaluation protocol, and benchmarks several machine learning architectures for this task: i) learning from spectrogram features; ii) end-to-end learning with a neural net; iii) end-to-end learning with a convolutional neural net. These experiments show that end-to-end models trained for note prediction learn frequency selective…
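The multi-label note prediction task described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper or the MusicNet release: the 128-note MIDI range, the 44.1 kHz sample rate, the 2048-sample window, the `(start, end, note)` annotation layout, and the helper names `label_vector` and `spectrogram_features`. The sketch shows how temporal labels might be turned into a binary target vector for one moment in a recording, and how a spectrogram-style feature vector might be computed from a window of audio.

```python
import numpy as np

NUM_NOTES = 128        # assumption: one output per MIDI note number
SAMPLE_RATE = 44100    # assumption: not stated in the abstract
WINDOW_SIZE = 2048     # assumption: illustrative window length in samples


def label_vector(annotations, t):
    """Binary multi-label target: entry n is 1 if note n sounds at time t (seconds).

    `annotations` is a hypothetical list of (start_sec, end_sec, midi_note) tuples.
    """
    y = np.zeros(NUM_NOTES, dtype=np.float32)
    for start, end, note in annotations:
        if start <= t < end:
            y[note] = 1.0
    return y


def spectrogram_features(window):
    """Magnitude spectrum of one Hann-windowed audio frame."""
    return np.abs(np.fft.rfft(window * np.hanning(len(window))))


# Synthetic stand-in for a recording: a pure 440 Hz tone.
time_axis = np.arange(WINDOW_SIZE) / SAMPLE_RATE
audio = np.sin(2 * np.pi * 440.0 * time_axis)

# Two overlapping hypothetical note annotations (A4 = 69, C5 = 72).
annotations = [(0.0, 1.0, 69), (0.5, 1.5, 72)]

x = spectrogram_features(audio)     # feature vector for one window
y = label_vector(annotations, 0.75)  # both notes are active at t = 0.75 s
```

Because several notes can sound at once, the target is a vector of independent binary labels rather than a single class, which is what makes the task multi-label. An end-to-end model, as contrasted in the abstract, would take the raw `audio` window as input instead of `x`.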

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing