User Specific Adaptation in Automatic Transcription of Vocalised Percussion
António Ramires, Rui Penha, Matthew E. P. Davies

TL;DR
This paper introduces LVT, a user-adaptive system for transcribing vocalised percussion sounds into drum patterns within DAWs, utilizing machine learning and feature selection for personalized accuracy.
Contribution
It presents a novel user-specific vocal percussion transcription system with feature selection, improving personalization in drum sound recognition.
Findings
Effective user-specific adaptation improves transcription accuracy.
The system successfully classifies vocalised drum sounds across different users.
Feature selection enhances the relevance of extracted audio features.
Abstract
The goal of this work is to develop an application that enables music producers to use their voice to create drum patterns when composing in Digital Audio Workstations (DAWs). An easy-to-use and user-oriented system capable of automatically transcribing vocalisations of percussion sounds, called LVT - Live Vocalised Transcription, is presented. LVT is developed as a Max for Live device which follows the 'segment-and-classify' methodology for drum transcription, and includes three modules: i) an onset detector to segment events in time; ii) a module that extracts relevant features from the audio content; and iii) a machine-learning component that implements the k-Nearest Neighbours (kNN) algorithm for the classification of vocalised drum timbres. Due to the wide differences in vocalisations from distinct users for the same drum sound, a user-specific approach to vocalised transcription…
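The three modules named in the abstract can be illustrated with a minimal Python sketch. This is not the LVT implementation, which is a Max for Live device: the energy-threshold onset detector, the two toy features (spectral centroid and zero-crossing rate), the synthetic "kick" and "hihat" sounds, and every function name and parameter below are assumptions chosen for illustration. Only the overall segment-and-classify structure and the use of kNN trained on one user's own examples come from the abstract.

```python
import numpy as np

def detect_onsets(signal, frame=256, threshold=0.1):
    """Assumed stand-in onset detector: a frame is an onset when its mean
    energy exceeds the threshold and the previous frame's did not."""
    n_frames = len(signal) // frame
    frames = signal[:n_frames * frame].reshape(n_frames, frame)
    active = np.mean(frames ** 2, axis=1) > threshold
    previous = np.concatenate(([False], active[:-1]))
    return np.flatnonzero(active & ~previous) * frame

def extract_features(segment):
    """Assumed toy features: normalised spectral centroid and zero-crossing rate."""
    spectrum = np.abs(np.fft.rfft(segment))
    bins = np.arange(len(spectrum))
    centroid = np.sum(bins * spectrum) / (np.sum(spectrum) + 1e-12) / len(spectrum)
    zcr = np.mean(np.abs(np.diff(np.sign(segment))) > 0)
    return np.array([centroid, zcr])

def knn_classify(train_X, train_y, x, k=3):
    """Plain k-Nearest Neighbours with Euclidean distance and majority vote."""
    distances = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(distances)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return str(labels[np.argmax(counts)])

def transcribe(signal, train_X, train_y, seg_len=1024, k=3):
    """Segment at each onset, then classify each segment with the user's kNN model."""
    events = []
    for onset in detect_onsets(signal):
        features = extract_features(signal[onset:onset + seg_len])
        events.append((int(onset), knn_classify(train_X, train_y, features, k)))
    return events

# Hypothetical synthetic sounds standing in for one user's vocalisations.
def make_kick(cycles, n=1024):
    return np.sin(2 * np.pi * cycles * np.arange(n) / n)  # low-pitched tone

def make_hihat(seed, n=1024):
    return np.random.default_rng(seed).uniform(-1.0, 1.0, n)  # broadband noise

# User-specific training set: a few labelled examples recorded by the same user.
train_segments = [make_kick(c) for c in (3, 4, 5)] + [make_hihat(s) for s in (1, 2, 3)]
train_X = np.array([extract_features(s) for s in train_segments])
train_y = np.array(["kick"] * 3 + ["hihat"] * 3)

# A short performance: silence, kick, silence, hihat, silence.
silence = np.zeros(1024)
performance = np.concatenate([silence, make_kick(4), silence, make_hihat(42), silence])
events = transcribe(performance, train_X, train_y)
print(events)
```

Because the training examples come from the same (here simulated) user as the performance, the classifier only has to separate that user's own sounds, which is the motivation the abstract gives for a user-specific approach.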
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
