Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity
Ethan Manilow, Gordon Wichern, Prem Seetharaman, Jonathan Le Roux

TL;DR
This paper introduces Slakh, a large, high-quality dataset of music synthesized from MIDI data, to facilitate research in music source separation and address the scarcity of training data.
Contribution
The paper presents Slakh, a new large-scale synthesized dataset for music source separation, generated from MIDI data and virtual instruments, enabling more effective training and evaluation.
Findings
Slakh can effectively augment existing datasets for instrument separation.
The dataset contains 145 hours of high-quality synthesized mixtures.
Slakh opens new possibilities for data-intensive music analysis tasks.
Abstract
Music source separation performance has greatly improved in recent years with the advent of approaches based on deep learning. Such methods typically require large amounts of labelled training data, which in the case of music consist of mixtures and corresponding instrument stems. However, stems are unavailable for most commercial music, and only limited datasets have so far been released to the public. It can thus be difficult to draw conclusions when comparing various source separation methods, as the difference in performance may stem as much from better data augmentation techniques or training tricks to alleviate the limited availability of training data, as from intrinsically better model architectures and objective functions. In this paper, we present the synthesized Lakh dataset (Slakh) as a new tool for music source separation research. Slakh consists of high-quality renderings…
