Learning Models for Query by Vocal Percussion: A Comparative Study
Alejandro Delgado, SkoT McDonald, Ning Xu, Charalampos Saitis, Mark Sandler

TL;DR
This paper compares traditional and deep learning methods for retrieving drum sounds from vocal percussion, aiming to improve creative workflows for artists through effective, efficient, and stable query systems.
Contribution
It evaluates various machine learning strategies and data augmentation techniques for vocal percussion-based drum sound retrieval, providing insights into their effectiveness and stability.
Findings
Deep learning models outperform traditional algorithms in accuracy.
Data augmentation improves model generalisation.
Trade-offs exist between speed and accuracy in different models.
Abstract
The imitation of percussive sounds via the human voice is a natural and effective tool for communicating rhythmic ideas on the fly. Thus, the automatic retrieval of drum sounds using vocal percussion can help artists prototype drum patterns in a comfortable and quick way, smoothing the creative workflow as a result. Here we explore different strategies to perform this type of query, making use of both traditional machine learning algorithms and recent deep learning techniques. The main hyperparameters from the models involved are carefully selected by feeding performance metrics to a grid search algorithm. We also look into several audio data augmentation techniques, which can potentially regularise deep learning models and improve generalisation. We compare the final performances in terms of effectiveness (classification accuracy), efficiency (computational speed), stability…
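The abstract mentions selecting hyperparameters by feeding performance metrics to a grid search. The following is a minimal sketch of that general idea, not the authors' implementation: the hyperparameter names, the grid values, and the `toy_validation_accuracy` function are hypothetical placeholders standing in for "train a model and measure its validation accuracy".

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Score every combination in param_grid and return the best one.

    param_grid maps a hyperparameter name to a list of candidate values.
    score_fn takes the hyperparameters as keyword arguments and returns
    a number where higher is better (e.g. validation accuracy).
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical stand-in for training a model and returning its
# validation accuracy; a real study would fit a classifier here.
def toy_validation_accuracy(learning_rate, n_neighbors):
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(n_neighbors - 5) * 0.02


# Assumed grid, chosen only for illustration.
param_grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "n_neighbors": [3, 5, 7],
}

best_params, best_score = grid_search(param_grid, toy_validation_accuracy)
print(best_params, best_score)
```

An exhaustive search like this grows multiplicatively with the number of candidate values per hyperparameter, which is why it is usually limited to a model's main hyperparameters.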
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
