SpaIn-Net: Spatially-Informed Stereophonic Music Source Separation

Darius Petermann; Minje Kim

arXiv:2202.07523·eess.AS·February 16, 2022

Open Access

TL;DR

SpaIn-Net conditions stereophonic music source separation on explicit spatial information, given as the panning angle of each target source. This helps disentangle same-class instruments and keeps the model robust when the user-supplied panning information is incorrect.

Contribution

The paper introduces spatially-informed conditioning mechanisms for music source separation, improving performance and user control over traditional location-agnostic models.

Findings

01. Improved separation by 1.8 dB SI-SDR with spatial conditioning

02. Enabled disentanglement of same-class instruments

03. Robustness to incorrect panning information
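For readers unfamiliar with the metric in the first finding, SI-SDR (scale-invariant signal-to-distortion ratio) projects the estimate onto the reference and compares the energy of that projection with the energy of the residual, in dB. The sketch below uses the standard definition of the metric; it is not the authors' evaluation code, and the function name, the `eps` stabilizer, and the toy sinusoid signals are assumptions made for illustration.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length mono signals.

    Illustrative helper (hypothetical name); assumes roughly zero-mean inputs.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    # Optimal scaling of the reference: alpha = <estimate, reference> / ||reference||^2
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    alpha = dot / (ref_energy + eps)
    # Energy of the scaled target and of the residual (everything else)
    target_energy = alpha * alpha * ref_energy
    residual_energy = sum((e - alpha * r) ** 2 for e, r in zip(estimate, reference))
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


# Toy example: a 5-cycle sinusoid as the reference, plus an orthogonal
# 50-cycle sinusoid at one tenth of the amplitude acting as interference.
n = 1000
reference = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
interference = [math.sin(2 * math.pi * 50 * i / n) for i in range(n)]
estimate = [r + 0.1 * v for r, v in zip(reference, interference)]

score = si_sdr(estimate, reference)          # about 20 dB by construction
rescaled = si_sdr([3.0 * e for e in estimate], reference)  # same score after rescaling
```

Because the estimate is rescaled before comparison, a global gain change does not affect the score, which is why the metric is convenient when comparing separation systems with different output levels.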

Abstract

With the recent advancements of data-driven approaches using deep neural networks, music source separation has been formulated as an instrument-specific supervised problem. While existing deep learning models implicitly absorb the spatial information conveyed by the multi-channel input signals, we argue that a more explicit and active use of spatial information could not only improve the separation process but also provide an entry-point for many user-interaction-based tools. To this end, we introduce a control method based on the stereophonic location of the sources of interest, expressed as the panning angle. We present various conditioning mechanisms, including the use of raw angle and its derived feature representations, and show that spatial information helps. Our proposed approaches improve the separation performance compared to location-agnostic architectures by 1.8 dB SI-SDR in…
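To make the idea of a panning angle concrete, the sketch below places a mono source in a stereo field with a constant-power (sine/cosine) panning law, derives a simple feature vector from the angle, and recovers the angle from channel energies. This is only one common convention, assumed here for illustration: the excerpt does not state the paper's panning law, angle range, or feature definitions, and the names `pan_mono_to_stereo`, `angle_features`, and `estimate_angle` are hypothetical.

```python
import math


def pan_mono_to_stereo(mono, angle_rad):
    """Constant-power panning (assumed convention).

    angle_rad = 0 is hard left, pi/2 is hard right, pi/4 is center.
    """
    gain_left, gain_right = math.cos(angle_rad), math.sin(angle_rad)
    left = [gain_left * x for x in mono]
    right = [gain_right * x for x in mono]
    return left, right


def angle_features(angle_rad):
    """One possible derived representation of the raw angle: the two channel gains."""
    return (math.cos(angle_rad), math.sin(angle_rad))


def estimate_angle(left, right):
    """Recover the panning angle of a single panned source from channel energies."""
    rms_left = math.sqrt(sum(x * x for x in left))
    rms_right = math.sqrt(sum(x * x for x in right))
    return math.atan2(rms_right, rms_left)


# Pan a short sinusoid to 60 degrees (toward the right) and read the angle back.
mono = [math.sin(2 * math.pi * 5 * i / 1000) for i in range(1000)]
left, right = pan_mono_to_stereo(mono, math.pi / 3)
recovered = estimate_angle(left, right)
features = angle_features(math.pi / 3)
```

In a conditioned separator, either the raw angle or a vector such as `features` could be supplied alongside the stereo mixture to indicate which source to extract; how SpaIn-Net actually injects this information is described in the full paper, not in this excerpt.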

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Music Technology and Sound Studies