The GTZAN dataset: Its contents, its faults, their effects on evaluation, and its future use
Bob L. Sturm

TL;DR
The paper examines the GTZAN dataset, widely used in music genre recognition (MGR) research, catalogs its faults, and analyzes how they affect system evaluation. It argues for using GTZAN with care rather than discarding it.
Contribution
The paper provides a catalog of GTZAN's faults, analyzes their effects on the evaluation of MGR systems, and disproves two claims: that all MGR systems are affected by these faults in the same way, and that results on GTZAN therefore remain meaningfully comparable.
Findings
GTZAN contains repetitions, mislabelings, and distortions.
These faults do not affect all MGR systems in the same way, so performances measured on GTZAN are not directly comparable.
Few studies have considered the impact of these faults on results.
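To make the concern about repetitions concrete, here is a minimal sketch of one way an evaluation could keep known repeated excerpts on the same side of a train/test partition. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `group_aware_split`, the excerpt identifiers, and the `groups` mapping are all hypothetical, and a real mapping would have to come from a fault catalog such as the one the paper provides.

```python
from collections import defaultdict


def group_aware_split(excerpts, groups, test_fraction=0.5):
    """Partition excerpt IDs so that every group stays on one side.

    `groups` maps an excerpt ID to a group key (for example, a cluster of
    repeated excerpts). Excerpts missing from `groups` form their own
    singleton group. All names here are hypothetical placeholders.
    """
    clusters = defaultdict(list)
    for excerpt in excerpts:
        clusters[groups.get(excerpt, excerpt)].append(excerpt)

    train, test = [], []
    target = test_fraction * len(excerpts)

    # Visit larger clusters first, with a deterministic tie-break on the key.
    ordered = sorted(clusters, key=lambda k: (-len(clusters[k]), str(k)))
    for key in ordered:
        members = clusters[key]
        # Fill the test side up to its target size, never splitting a cluster.
        if len(test) + len(members) <= target:
            test.extend(members)
        else:
            train.extend(members)
    return train, test


if __name__ == "__main__":
    # Placeholder IDs; "a" and "b" are assumed to be repetitions of each other.
    ids = ["a", "b", "c", "d", "e", "f"]
    repeats = {"a": "g1", "b": "g1", "c": "g2"}
    train_ids, test_ids = group_aware_split(ids, repeats)
    print("train:", train_ids)
    print("test:", test_ids)
```

Keeping repeated excerpts together is only one of several possible responses to the faults listed above; it does nothing about mislabelings or distortions, which would need separate handling.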
Abstract
The GTZAN dataset appears in at least 100 published works, and is the most-used public dataset for evaluation in machine listening research for music genre recognition (MGR). Our recent work, however, shows GTZAN has several faults (repetitions, mislabelings, and distortions), which challenge the interpretability of any result derived using it. In this article, we disprove the claims that all MGR systems are affected in the same ways by these faults, and that the performances of MGR systems in GTZAN are still meaningfully comparable since they all face the same faults. We identify and analyze the contents of GTZAN, and provide a catalog of its faults. We review how GTZAN has been used in MGR research, and find few indications that its faults have been known and considered. Finally, we rigorously study the effects of its faults on evaluating five different MGR systems. The lesson is not…
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
