Automatic Instrument Recognition in Polyphonic Music Using Convolutional Neural Networks
Peter Li, Jiyuan Qian, Tian Wang

TL;DR
This paper demonstrates that convolutional neural networks trained end-to-end on raw audio can outperform traditional feature-engineering methods in automatic musical instrument recognition within polyphonic music.
Contribution
It introduces an end-to-end CNN approach for instrument recognition, eliminating the need for manual feature engineering and domain-specific knowledge.
Findings
The CNN surpasses traditional methods that rely on hand-crafted features
End-to-end training removes the separate feature-engineering step from the pipeline
Raw audio input is effective for instrument classification
Abstract
Traditional methods to tackle many music information retrieval tasks typically follow a two-step architecture: feature engineering followed by a simple learning algorithm. In these "shallow" architectures, feature engineering and learning are typically disjoint and unrelated. Additionally, feature engineering is difficult, and typically depends on extensive domain expertise. In this paper, we present an application of convolutional neural networks for the task of automatic musical instrument identification. In this model, feature extraction and learning algorithms are trained together in an end-to-end fashion. We show that a convolutional neural network trained on raw audio can achieve performance surpassing traditional methods that rely on hand-crafted features.
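To make the end-to-end idea concrete, the following is a minimal sketch of a forward pass that maps a raw waveform directly to per-instrument probabilities, with no hand-crafted feature step in between. It is not the authors' architecture: the single convolutional layer, the layer sizes, the stride, the 8 kHz test tone, the per-instrument sigmoid output (chosen here because several instruments can sound at once in polyphonic music), and the names `conv1d` and `predict_instruments` are all assumptions made for illustration. The weights are random and untrained, so the outputs carry no meaning; in an actual end-to-end setup the convolution kernels and the output layer would be learned jointly by backpropagation.

```python
import math
import random

def conv1d(signal, kernels, stride):
    """Strided 1-D convolution (no padding) of a waveform with each kernel, then ReLU."""
    width = len(kernels[0])
    feature_maps = []
    for kernel in kernels:
        row = []
        for start in range(0, len(signal) - width + 1, stride):
            window = signal[start:start + width]
            activation = sum(w * x for w, x in zip(kernel, window))
            row.append(max(0.0, activation))  # ReLU non-linearity
        feature_maps.append(row)
    return feature_maps

def predict_instruments(waveform, kernels, out_weights, out_bias, stride=4):
    """Raw samples -> learned filters -> global max pool -> one sigmoid per instrument."""
    feature_maps = conv1d(waveform, kernels, stride)
    pooled = [max(fm) for fm in feature_maps]  # one value per kernel, pooled over time
    probs = []
    for weights, bias in zip(out_weights, out_bias):
        logit = sum(w * p for w, p in zip(weights, pooled)) + bias
        probs.append(1.0 / (1.0 + math.exp(-logit)))  # independent multi-label output
    return probs

# Hypothetical sizes, chosen only to keep the example small.
rng = random.Random(0)
n_kernels, kernel_width, n_instruments = 8, 16, 3
kernels = [[rng.gauss(0, 0.1) for _ in range(kernel_width)] for _ in range(n_kernels)]
out_weights = [[rng.gauss(0, 0.1) for _ in range(n_kernels)] for _ in range(n_instruments)]
out_bias = [0.0] * n_instruments

# Stand-in input: 0.1 s of a 440 Hz sine sampled at 8 kHz (assumed values).
sample_rate = 8000
waveform = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(800)]

probs = predict_instruments(waveform, kernels, out_weights, out_bias)
print(probs)  # three untrained probabilities, one per hypothetical instrument
```

The contrast with the two-step pipeline described above is that `kernels` plays the role of the feature extractor but is an ordinary trainable parameter, so it can be optimized together with `out_weights` rather than designed by hand.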
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Diverse Musicological Studies
