Training chord recognition models on artificially generated audio
Martyna Majchrzak, Jacek Mańdziuk

TL;DR
This paper investigates the use of artificially generated audio datasets to train Transformer-based models for chord recognition, demonstrating their potential to supplement or replace real data in music information retrieval tasks.
Contribution
It introduces the use of Artificial Audio Multitracks (AAM) for training chord recognition models and evaluates their effectiveness compared to real datasets.
Findings
Artificial datasets can improve model training when real data is scarce.
AAM can be used alone or to augment small real datasets for better performance.
Artificial data shows promise despite differences from human-composed music.
Abstract
One of the challenging problems in Music Information Retrieval is the acquisition of enough non-copyrighted audio recordings for model training and evaluation. This study compares two Transformer-based neural network models for chord sequence recognition in audio recordings and examines the effectiveness of using an artificially generated dataset for this purpose. The models are trained on various combinations of Artificial Audio Multitracks (AAM), Schubert's Winterreise Dataset, and the McGill Billboard Dataset and evaluated with three metrics: Root, MajMin and Chord Content Metric (CCM). The experiments prove that even though there are certainly differences in complexity and structure between artificially generated and human-composed music, the former can be useful in certain scenarios. Specifically, AAM can enrich a smaller training dataset of music composed by a human or can even be…
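To make the Root and MajMin metrics named in the abstract more concrete, here is a minimal sketch of how such scores are commonly computed on frame-aligned chord labels. It is an illustration under stated assumptions, not the authors' evaluation code: the `root:quality` label format, the sharps-only note table, the equal-length frames, and the helper names `parse_chord`, `root_score`, and `majmin_score` are all hypothetical, and the paper's exact definitions (for example, duration weighting) may differ. The Chord Content Metric (CCM) is not covered, since its definition is not given in this summary.

```python
from typing import List, Optional, Tuple

# Sharps-only pitch-class table (an assumption of this sketch; flats are not handled).
PITCH_CLASSES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
                 "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}


def parse_chord(label: str) -> Optional[Tuple[int, str]]:
    """Parse labels such as 'C:maj' or 'A:min' into (root pitch class, quality).

    The special label 'N' (no chord) is returned as None.
    A bare root such as 'C' is treated as major.
    """
    if label == "N":
        return None
    root, _, quality = label.partition(":")
    return PITCH_CLASSES[root], quality or "maj"


def root_score(reference: List[str], estimate: List[str]) -> float:
    """Fraction of frames in which the estimated root equals the reference root."""
    hits = 0
    for ref, est in zip(reference, estimate):
        r, e = parse_chord(ref), parse_chord(est)
        if r is None or e is None:
            # 'No chord' only matches 'no chord'.
            hits += int(r is None and e is None)
        else:
            hits += int(r[0] == e[0])
    return hits / len(reference)


def majmin_score(reference: List[str], estimate: List[str]) -> float:
    """Fraction of frames with matching root and major/minor quality.

    Frames whose reference chord lies outside the major/minor vocabulary
    are skipped rather than counted as errors.
    """
    hits, counted = 0, 0
    for ref, est in zip(reference, estimate):
        r, e = parse_chord(ref), parse_chord(est)
        if r is not None and r[1] not in ("maj", "min"):
            continue
        counted += 1
        hits += int(r == e)
    return hits / counted if counted else 0.0


# Toy example: five frames of made-up labels.
reference = ["C:maj", "C:maj", "A:min", "G:maj", "N"]
estimate = ["C:maj", "C:min", "A:min", "E:min", "N"]

print(root_score(reference, estimate))    # 0.8
print(majmin_score(reference, estimate))  # 0.6
```

In the toy example, the second frame has the correct root but the wrong quality, so it counts for Root and not for MajMin. This is why MajMin is the stricter of the two scores.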
Taxonomy
Topics: Music and Audio Processing
