TL;DR
This paper introduces a convolutional neural network framework for identifying predominant musical instruments in polyphonic music. It reports improved accuracy over traditional methods, with parameters tuned through extensive experiments.
Contribution
It presents a CNN-based approach to predominant instrument recognition in polyphonic music, compares two methods for aggregating sliding-window outputs into a per-excerpt result, and tunes the model's parameters experimentally.
Findings
The CNN outperforms traditional methods based on spectral features
Achieved 19.6% and 16.4% improvements in micro and macro F1 measure, respectively, over state-of-the-art algorithms
Performance depends on the analysis window size and the choice of activation function, both of which were tuned experimentally
Abstract
Identifying musical instruments in polyphonic music recordings is a challenging but important problem in the field of music information retrieval. It enables music search by instrument, helps recognize musical genres, or can make music transcription easier and more accurate. In this paper, we present a convolutional neural network framework for predominant instrument recognition in real-world polyphonic music. We train our network from fixed-length music excerpts with a single-labeled predominant instrument and estimate an arbitrary number of predominant instruments from an audio signal with a variable length. To obtain the audio-excerpt-wise result, we aggregate multiple outputs from sliding windows over the test audio. In doing so, we investigated two different aggregation methods: one takes the average for each instrument and the other takes the instrument-wise sum followed by…
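The sliding-window aggregation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the example scores, the instrument labels, and the 0.5 threshold are all assumptions made for the example. The step that follows the instrument-wise sum is cut off in the abstract above, so the sketch stops at the sum and does not guess at it.

```python
from typing import List, Sequence


def window_starts(num_frames: int, window: int, hop: int) -> List[int]:
    """Start indices of fixed-length windows slid over a variable-length input."""
    if num_frames < window:
        return []
    return list(range(0, num_frames - window + 1, hop))


def aggregate_mean(window_outputs: Sequence[Sequence[float]]) -> List[float]:
    """First method: average the per-window scores for each instrument."""
    n = len(window_outputs)
    return [sum(column) / n for column in zip(*window_outputs)]


def aggregate_sum(window_outputs: Sequence[Sequence[float]]) -> List[float]:
    """Second method, first step only: instrument-wise sum over all windows."""
    return [sum(column) for column in zip(*window_outputs)]


def pick_instruments(scores: Sequence[float], labels: Sequence[str],
                     threshold: float) -> List[str]:
    """Keep every instrument whose aggregated score reaches the threshold."""
    return [label for label, score in zip(labels, scores) if score >= threshold]


# Hypothetical per-window network outputs: one row per window,
# one column per instrument. The values are made up for illustration.
labels = ["flute", "piano", "violin"]
window_outputs = [
    [0.1, 0.8, 0.3],
    [0.2, 0.6, 0.7],
    [0.0, 0.7, 0.8],
]

mean_scores = aggregate_mean(window_outputs)
sum_scores = aggregate_sum(window_outputs)
predominant = pick_instruments(mean_scores, labels, threshold=0.5)
print(predominant)  # ['piano', 'violin']
```

Because the selection is made by thresholding rather than by taking a single maximum, the number of instruments returned is not fixed in advance, which matches the abstract's goal of estimating an arbitrary number of predominant instruments from audio of variable length.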
