TL;DR
This paper explores data augmentation techniques for improving instrument classification in polyphonic music using only monophonic training data, employing CNNs and ensemble methods to achieve a label ranking average precision (LRAP) of over 80%.
Contribution
It introduces novel augmentation strategies like genre, pitch, and tempo synchronization for monophonic data to enhance polyphonic instrument classification performance.
Findings
Ensemble of classifiers improves accuracy.
Synchronization techniques enhance model performance.
Achieved over 80% LRAP on the IRMAS dataset.
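For readers unfamiliar with the metric, the following is a minimal sketch of how LRAP is commonly computed for multi-label predictions. It follows the standard definition rather than the paper's own evaluation code, which is not shown here. The function name `lrap` is hypothetical, and scoring a sample with no (or only) relevant labels as 1.0 is an assumed convention.

```python
def lrap(y_true, y_score):
    """Label ranking average precision over a batch of samples.

    y_true:  list of multi-hot label lists (1 = instrument present).
    y_score: list of per-class score lists, same shape as y_true.
    """
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t]
        # Assumed convention: degenerate samples count as a perfect score.
        if not relevant or len(relevant) == len(truth):
            total += 1.0
            continue
        sample = 0.0
        for j in relevant:
            # Rank of label j = number of labels scored at least as high.
            rank = sum(1 for s in scores if s >= scores[j])
            # How many of those highly ranked labels are actually relevant.
            hits = sum(1 for k in relevant if scores[k] >= scores[j])
            sample += hits / rank
        total += sample / len(relevant)
    return total / len(y_true)


# Two clips, three instrument classes.
score = lrap([[1, 0, 0], [0, 0, 1]], [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]])
```

In the first clip the true instrument is ranked second (precision 1/2), and in the second it is ranked last (precision 1/3), so the batch score is the mean of the two.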
Abstract
Instrument classification is one of the fields in Music Information Retrieval (MIR) that has attracted a lot of research interest. However, the majority of that is dealing with monophonic music, while efforts on polyphonic material mainly focus on predominant instrument recognition. In this paper, we propose an approach for instrument classification in polyphonic music from purely monophonic data, that involves performing data augmentation by mixing different audio segments. A variety of data augmentation techniques focusing on different sonic aspects, such as overlaying audio segments of the same genre, as well as pitch and tempo-based synchronization, are explored. We utilize Convolutional Neural Networks for the classification task, comparing shallow to deep network architectures. We further investigate the usage of a combination of the above classifiers, each trained on a single…
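The core augmentation idea in the abstract, mixing monophonic segments to create polyphonic training examples, can be sketched roughly as follows. This is an illustrative assumption, not the authors' implementation: the function name `mix_segments`, the peak normalization, and the truncation to the shortest segment are all hypothetical choices. The genre, pitch, and tempo synchronization steps are omitted.

```python
def mix_segments(segments, labels, num_classes, peak=0.9):
    """Overlay mono segments and build a multi-hot instrument label.

    segments:    list of sample lists (floats), all at the same sample rate.
    labels:      one integer class index per segment.
    num_classes: total number of instrument classes.
    peak:        target peak amplitude after mixing (assumed value).
    """
    # Truncate to the shortest segment so every sample has all sources.
    length = min(len(s) for s in segments)
    mixture = [sum(s[i] for s in segments) for i in range(length)]

    # Peak-normalize to avoid clipping when sources add up.
    max_abs = max((abs(x) for x in mixture), default=0.0)
    if max_abs > 0:
        scale = peak / max_abs
        mixture = [x * scale for x in mixture]

    # The mixture is labeled with the union of its source instruments.
    target = [0] * num_classes
    for lab in labels:
        target[lab] = 1
    return mixture, target


# Toy example: two short "segments" labeled as classes 0 and 2 out of 4.
mixture, target = mix_segments(
    [[0.5, -0.5, 0.25, 0.0], [0.5, 0.5, -0.25]], [0, 2], num_classes=4
)
```

The resulting pair can then be fed to a multi-label classifier, with the multi-hot `target` standing in for the polyphonic annotation that purely monophonic data lacks.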
