Learned complex masks for multi-instrument source separation
Andreas Jansson, Rachel M. Bittner, Nicola Montecchio, Tillman Weyde

TL;DR
This paper introduces a method for music source separation that estimates complex masks directly in the time-frequency domain, improving separation quality by reducing phase artifacts compared to traditional magnitude-only masks.
Contribution
It extends complex mask estimation techniques from speech enhancement to multi-instrument music separation, addressing phase artifact issues and enhancing separation performance.
Findings
Complex masks outperform magnitude-only masks in separation quality.
The method reduces audible phase artifacts in separated sources.
Separation improves most for instruments whose spectra overlap.
Abstract
Music source separation in the time-frequency domain is commonly achieved by applying a soft or binary mask to the magnitude component of (complex) spectrograms. The phase component is usually not estimated, but instead copied from the mixture and applied to the magnitudes of the estimated isolated sources. While this method has several practical advantages, it imposes an upper bound on the performance of the system, where the estimated isolated sources inherently exhibit audible "phase artifacts". In this paper we address these shortcomings by directly estimating masks in the complex domain, extending recent work from the speech enhancement literature. The method is particularly well suited for multi-instrument musical source separation since residual phase artifacts are more pronounced for spectrally overlapping instrument sources, a common scenario in music. We show that complex…
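The upper bound mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the paper's learned model: it uses oracle masks computed from known sources, and random complex arrays stand in for real spectrograms. The names `S1`, `S2`, `X`, `eps`, and `rel_error` are assumptions made for this illustration only. Even with a perfect magnitude estimate, reusing the mixture phase leaves a residual error, whereas an oracle complex mask recovers the source almost exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spectrogram(shape):
    """Random complex array standing in for an STFT (illustrative only)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Two hypothetical sources and their mixture in the time-frequency domain.
S1 = random_spectrogram((257, 100))
S2 = random_spectrogram((257, 100))
X = S1 + S2
eps = 1e-8  # guards against division by zero

# Oracle magnitude mask: exact source magnitude, phase copied from the mixture.
mag_mask = np.abs(S1) / (np.abs(X) + eps)
S1_mag = mag_mask * np.abs(X) * np.exp(1j * np.angle(X))

# Oracle complex mask: corrects magnitude and phase in each bin.
complex_mask = S1 / (X + eps)
S1_cplx = complex_mask * X

def rel_error(estimate, reference):
    """Relative L2 error between an estimate and the true source."""
    return np.linalg.norm(estimate - reference) / np.linalg.norm(reference)

err_mag = rel_error(S1_mag, S1)
err_cplx = rel_error(S1_cplx, S1)
print(f"magnitude mask + mixture phase: {err_mag:.3f}")
print(f"complex mask:                   {err_cplx:.2e}")
```

In this toy setting the magnitude-mask error comes entirely from the phase mismatch between the mixture and the source, which is the kind of residual a complex mask is meant to remove. A learned system would only approximate either mask, so real-world gaps are expected to be smaller than this oracle comparison suggests.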
Taxonomy
Topics: Speech and Audio Processing · Blind Source Separation Techniques · Advanced Adaptive Filtering Techniques
