Deep convolutional neural networks for predominant instrument recognition in polyphonic music

Yoonchang Han; Jaehun Kim; Kyogu Lee

arXiv:1605.09507 · cs.SD · December 28, 2016

TL;DR

This paper introduces a convolutional neural network framework that identifies predominant musical instruments in polyphonic music. The network is trained on fixed-length, single-label excerpts and applied to variable-length audio by aggregating sliding-window outputs, and it reports improved accuracy over traditional methods.

Contribution

It presents a CNN-based approach to predominant instrument recognition in polyphonic music, compares two methods for aggregating sliding-window outputs into an excerpt-level result, and tunes the main parameters experimentally.

Findings

01

CNN outperforms traditional spectral feature methods

02

Achieved 19.6% and 16.4% improvements in micro and macro F1 measure, respectively, over state-of-the-art algorithms

03

Analysis window size, identification threshold, and activation function were tuned experimentally to find the best-performing configuration

Abstract

Identifying musical instruments in polyphonic music recordings is a challenging but important problem in the field of music information retrieval. It enables music search by instrument, helps recognize musical genres, or can make music transcription easier and more accurate. In this paper, we present a convolutional neural network framework for predominant instrument recognition in real-world polyphonic music. We train our network from fixed-length music excerpts with a single-labeled predominant instrument and estimate an arbitrary number of predominant instruments from an audio signal with a variable length. To obtain the audio-excerpt-wise result, we aggregate multiple outputs from sliding windows over the test audio. In doing so, we investigated two different aggregation methods: one takes the average for each instrument and the other takes the instrument-wise sum followed by…
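The averaging variant of the aggregation step described in the abstract might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the window and hop sizes, and the 0.5 threshold used in the usage note are assumptions introduced here, and the per-window scores stand in for the network's outputs. The second aggregation method is not shown because its description is cut off above.

```python
# Hypothetical sketch of sliding-window aggregation by per-instrument averaging.
# All names and default values are illustrative assumptions, not taken from the paper.

def sliding_windows(signal, win, hop):
    """Split a sequence into fixed-length windows; a trailing remainder is dropped."""
    return [signal[s:s + win] for s in range(0, len(signal) - win + 1, hop)]

def aggregate_mean(window_scores):
    """Average the per-instrument scores across all windows.

    window_scores: list of per-window score lists, one score per instrument.
    Returns one averaged score per instrument.
    """
    n = len(window_scores)
    return [sum(col) / n for col in zip(*window_scores)]

def predict_instruments(window_scores, threshold=0.5):
    """Return indices of instruments whose averaged score reaches the threshold.

    The threshold value is an assumption; any number of instruments may pass it.
    """
    averaged = aggregate_mean(window_scores)
    return [i for i, score in enumerate(averaged) if score >= threshold]
```

For example, with two windows scored as `[0.9, 0.1, 0.4]` and `[0.7, 0.3, 0.8]`, the averaged scores are roughly `[0.8, 0.2, 0.6]`, so `predict_instruments` selects instruments 0 and 2. Thresholding the averaged scores, rather than taking a single argmax, is what lets the sketch return an arbitrary number of instruments per excerpt.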

Code & Models

Repositories

iooops/CS221-Audio-Tagging
