SynthSOD: Developing an Heterogeneous Dataset for Orchestra Music Source Separation

Jaime Garcia-Martinez; David Diaz-Guerra; Archontis Politis; Tuomas Virtanen; Julio J. Carabias-Orti; and Pedro Vera-Candeas

arXiv:2409.10995·eess.AS·February 18, 2025

TL;DR

This paper introduces SynthSOD, a new multitrack dataset for orchestra music source separation, addressing the lack of comprehensive datasets for extracting similar-sounding orchestral sources.

Contribution

SynthSOD is a novel, realistic, and heterogeneous dataset created using simulation techniques, enabling improved training for orchestra source separation models.

Findings

01. Baseline model trained on SynthSOD performs well on synthetic data.

02. Model generalizes to real-world orchestral recordings.

03. SynthSOD enhances the development of source separation techniques for orchestral music.

Abstract

Recent advancements in music source separation have significantly progressed, particularly in isolating vocals, drums, and bass elements from mixed tracks. These developments owe much to the creation and use of large-scale, multitrack datasets dedicated to these specific components. However, the challenge of extracting similarly sounding sources from orchestra recordings has not been extensively explored, largely due to a scarcity of comprehensive and clean (i.e. bleed-free) multitrack datasets. In this paper, we introduce a novel multitrack dataset called SynthSOD, developed using a set of simulation techniques to create a realistic (i.e. using high-quality soundfonts), musically motivated, and heterogeneous training set comprising different dynamics, natural tempo changes, styles, and conditions. Moreover, we demonstrate the application of a widely used baseline music separation model…

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis

Methods: Sparse Evolutionary Training