FMA: A Dataset For Music Analysis
Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson

TL;DR
The FMA dataset offers a large, open collection of high-quality music audio, metadata, and features to facilitate research in music information retrieval, addressing the scarcity of extensive datasets in the field.
Contribution
This paper introduces the comprehensive FMA dataset, including its creation process, structure, and baseline evaluations for MIR tasks, enhancing resources for music analysis research.
Findings
Provides 917 GiB of audio data from over 106,000 tracks
Includes pre-computed features and detailed metadata
Evaluates baseline genre recognition performance
Abstract
We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some…
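The totals quoted in the abstract can be turned into rough per-track averages. The following is a minimal sketch, not part of the paper: the function name `dataset_averages` and the dictionary keys are hypothetical, and the inputs are the rounded figures from the abstract (917 GiB, 343 days, 106,574 tracks), so the results are only approximate.

```python
# Figures quoted in the abstract; they are rounded there, so every
# derived value below is an approximation, not an official statistic.
TOTAL_GIB = 917        # total audio size in GiB
TOTAL_DAYS = 343       # total audio duration in days
N_TRACKS = 106_574     # number of tracks


def dataset_averages(total_gib=TOTAL_GIB, total_days=TOTAL_DAYS, n_tracks=N_TRACKS):
    """Return approximate per-track averages derived from dataset totals."""
    total_bytes = total_gib * 1024**3          # GiB -> bytes
    total_seconds = total_days * 24 * 60 * 60  # days -> seconds
    return {
        "seconds_per_track": total_seconds / n_tracks,
        "mib_per_track": total_bytes / n_tracks / 1024**2,
        "kbit_per_second": total_bytes * 8 / total_seconds / 1000,
    }


if __name__ == "__main__":
    for name, value in dataset_averages().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the averages come out at roughly 278 seconds and 8.8 MiB per track, or about 266 kbit/s, which is consistent with the abstract's description of full-length, high-quality audio.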
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Diverse Musicological Studies
