Do Music Source Separation Models Preserve Spatial Information in Binaural Audio?
Richa Namballa, Agnieszka Roginska, Magdalena Fuentes

TL;DR
This paper evaluates how well existing music source separation models preserve spatial information in binaural audio, revealing significant degradation and highlighting future research opportunities in immersive audio applications.
Contribution
The paper provides the first systematic assessment of how well MSS models retain spatial cues in binaural audio, using newly synthesized binaural datasets and interaural cue-based metrics.
Findings
Stereo MSS models fail to preserve spatial cues in binaural audio.
Degradation varies with model architecture and instrument type.
The results highlight opportunities for improving MSS in immersive audio contexts.
Abstract
Binaural audio remains underexplored within the music information retrieval community. Motivated by the rising popularity of virtual and augmented reality experiences as well as potential applications to accessibility, we investigate how well existing music source separation (MSS) models perform on binaural audio. Although these models process two-channel inputs, it is unclear how effectively they retain spatial information. In this work, we evaluate how several popular MSS models preserve spatial information on both standard stereo and novel binaural datasets. Our binaural data is synthesized using stems from MUSDB18-HQ and open-source head-related transfer functions by positioning instrument sources randomly along the horizontal plane. We then assess the spatial quality of the separated stems using signal processing and interaural cue-based metrics. Our results show that stereo MSS…
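To make "interaural cue-based metrics" concrete, here is a minimal Python sketch of two standard cues: the interaural level difference (ILD) and the interaural time difference (ITD). It is an illustration, not the paper's evaluation code. The function names are hypothetical, the ITD estimator (cross-correlation peak) is one common choice among several, and the toy "binaural" signal uses a plain delay and gain as a stand-in for real head-related transfer functions.

```python
import numpy as np


def interaural_level_difference(left, right, eps=1e-12):
    """ILD in dB. Positive when the left channel carries more energy."""
    p_left = np.mean(left ** 2)
    p_right = np.mean(right ** 2)
    return 10.0 * np.log10((p_left + eps) / (p_right + eps))


def interaural_time_difference(left, right, sr, max_lag_s=1e-3):
    """ITD in seconds from the cross-correlation peak.

    Positive when the right channel lags the left channel.
    Lags are searched within +/- max_lag_s (1 ms by default, an assumption
    that roughly covers the range of human interaural delays).
    """
    n = len(left)
    max_lag = int(round(max_lag_s * sr))
    lags = np.arange(-max_lag, max_lag + 1)
    xcorr = np.array([
        # Sum over n of left[n] * right[n + k], using only overlapping samples.
        np.dot(left[max(0, -k): n - max(0, k)],
               right[max(0, k): n - max(0, -k)])
        for k in lags
    ])
    return lags[np.argmax(xcorr)] / sr


# Toy example (assumption: delay + gain instead of a measured HRTF).
sr = 44100
rng = np.random.default_rng(0)
source = rng.standard_normal(sr)          # one second of white noise

delay = 20                                # samples by which the right ear lags
left = source
right = 0.5 * np.concatenate([np.zeros(delay), source[:-delay]])

ild = interaural_level_difference(left, right)
itd = interaural_time_difference(left, right, sr)
print(f"ILD: {ild:.2f} dB, ITD: {itd * 1e3:.3f} ms")
```

One way to assess a separated stem, under these assumptions, would be to compute the same cues on the reference stem and on the model's estimate and compare them; a large difference would suggest the separation altered the perceived source position.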
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Music and Audio Processing
