Data Augmentation for Instrument Classification Robust to Audio Effects
António Ramires, Xavier Serra

TL;DR
This paper investigates how data augmentation with audio effects can improve the robustness of automatic instrument classification models used in electronic music production, focusing on classifying processed sounds in audio collections.
Contribution
It evaluates the impact of audio effect-based data augmentation on the robustness of a state-of-the-art instrument classification model.
Findings
Audio effects significantly influence classification accuracy.
Data augmentation improves robustness to processed sounds.
Certain effects reduce model performance more than others.
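To make the idea of effect-based augmentation concrete, here is a minimal sketch in Python using only NumPy. It is an illustration and not the authors' pipeline: the two effects (a tanh soft-clipping distortion and a single-tap echo), the function names, and all parameter values are assumptions chosen for the example, since this summary does not list which effects or settings the paper used.

```python
import numpy as np

def distortion(x, gain=5.0):
    """Hypothetical effect: tanh soft clipping, output bounded in [-1, 1]."""
    return np.tanh(gain * np.asarray(x, dtype=float))

def echo(x, sr, delay_s=0.05, mix=0.5):
    """Hypothetical effect: add one delayed, attenuated copy of the signal."""
    x = np.asarray(x, dtype=float)
    d = int(round(delay_s * sr))      # delay expressed in samples
    y = x.copy()
    if 0 < d < len(x):
        y[d:] += mix * x[:-d]         # mix the delayed copy into the tail
    return y

def augment(x, sr):
    """Return the dry sound plus one processed copy per effect.

    Every copy keeps the instrument label of the original sound, so the
    training set grows with processed versions of each example.
    """
    return {
        "dry": np.asarray(x, dtype=float),
        "distortion": distortion(x),
        "echo": echo(x, sr),
    }

# Usage: augment a synthetic half-second 220 Hz tone (stand-in for a one-shot sample).
sr = 16000
t = np.arange(int(0.5 * sr)) / sr
tone = 0.8 * np.sin(2 * np.pi * 220.0 * t)
versions = augment(tone, sr)
```

In a training setup, each entry of `versions` would be passed through the same feature extraction as the unprocessed sound and added to the training set under the original label.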
Abstract
Reusing recorded sounds (sampling) is a key component in Electronic Music Production (EMP), which has been present since its early days and is at the core of genres like hip-hop or jungle. Commercial and non-commercial services allow users to obtain collections of sounds (sample packs) to reuse in their compositions. Automatic classification of one-shot instrumental sounds allows automatically categorising the sounds contained in these collections, allowing easier navigation and better characterisation. Automatic instrument classification has mostly targeted the classification of unprocessed isolated instrumental sounds or detecting predominant instruments in mixed music tracks. For this classification to be useful in audio databases for EMP, it has to be robust to the audio effects applied to unprocessed sounds. In this paper we evaluate how a state-of-the-art model trained with a…
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
