The Spheres Dataset: Multitrack Orchestral Recordings for Music Source Separation and Information Retrieval
Jaime Garcia-Martinez, David Diaz-Guerra, John Anderson, Ricardo Falcon-Perez, Pablo Cabañas-Molero, Tuomas Virtanen, Julio J. Carabias-Orti, Pedro Vera-Candeas

TL;DR
The Spheres dataset provides multitrack orchestral recordings with detailed acoustic data, designed to advance machine learning research in music source separation and related MIR tasks in classical music.
Contribution
The paper introduces a comprehensive orchestral dataset with multi-microphone recordings, isolated stems, and room impulse responses for benchmarking and developing separation algorithms.
Findings
Baseline models show promising separation performance.
The dataset reveals challenges in complex orchestral source separation.
Acoustic analysis offers insights into recording space characteristics.
Abstract
This paper introduces The Spheres dataset, a set of multitrack orchestral recordings designed to advance machine learning research in music source separation and related MIR tasks within the classical music domain. The dataset comprises over one hour of recordings of musical pieces performed by the Colibrì Ensemble at The Spheres recording studio, capturing two canonical works (Tchaikovsky's Romeo and Juliet and Mozart's Symphony No. 40) along with chromatic scales and solo excerpts for each instrument. The recording setup employed 23 microphones, including close spot, main, and ambient microphones, enabling the creation of realistic stereo mixes with controlled bleeding and providing isolated stems for supervised training of source separation models. In addition, room impulse responses were estimated for each instrument position, offering valuable acoustic characterization of the…
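As a rough illustration of how isolated stems support supervised separation, the sketch below sums two stems into a mixture and scores an estimate against its reference stem. Everything here is an assumption made for illustration: the function names `mix_stems` and `sdr_db` are hypothetical, the signals are synthetic sine tones rather than dataset audio, the mixture is mono rather than stereo, and the metric is a plain signal-to-distortion ratio, not necessarily the one the authors report.

```python
import numpy as np


def mix_stems(stems, gains):
    """Sum per-instrument stems (each of shape [n_samples]) into one mono mixture."""
    stems = np.asarray(stems, dtype=np.float64)
    gains = np.asarray(gains, dtype=np.float64)
    # Scale each stem by its gain, then sum across instruments.
    return (gains[:, None] * stems).sum(axis=0)


def sdr_db(reference, estimate, eps=1e-12):
    """Plain signal-to-distortion ratio in dB (no scale or filter invariance)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    distortion = reference - estimate
    return 10.0 * np.log10(
        (np.sum(reference ** 2) + eps) / (np.sum(distortion ** 2) + eps)
    )


# Synthetic stand-ins for two isolated stems (not data from the dataset).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
violin = np.sin(2 * np.pi * 440.0 * t)
cello = np.sin(2 * np.pi * 110.0 * t)

# A supervised training pair is (mixture, target stem).
mixture = mix_stems([violin, cello], gains=[1.0, 1.0])

# Scoring the unprocessed mixture as the "estimate" gives a baseline
# that any separation model should improve on.
baseline_sdr = sdr_db(violin, mixture)
```

With two equal-energy stems, as here, the unprocessed-mixture baseline sits at roughly 0 dB, so any positive SDR from a model reflects actual separation.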
