Model-based STFT phase recovery for audio source separation
Paul Magron, Roland Badeau, Bertrand David

TL;DR
This paper introduces a novel iterative method for recovering phase in audio source separation by leveraging the temporal continuity of sinusoidal components, outperforming traditional Wiener filter approaches when magnitude estimates are accurate.
Contribution
It proposes a new phase recovery technique based on unwrapping and iterative minimization, improving upon existing methods in audio source separation.
Findings
Outperforms the state-of-the-art Wiener filter in experiments when magnitude estimates are accurate
Utilizes sinusoidal phase unwrapping for better phase estimates
Enhances time-domain signal synthesis accuracy
Abstract
For audio source separation applications, it is common to estimate the magnitude of the short-time Fourier transform (STFT) of each source. To then synthesize time-domain signals, it is necessary to recover the phase of the corresponding complex-valued STFT. Most authors in this field choose a Wiener-like filtering approach, which boils down to using the phase of the original mixture. In this paper, a different standpoint is adopted. Many music events are partially composed of slowly varying sinusoids, and the STFT phase increment over time of those frequency components takes a specific form. This allows phase recovery by an unwrapping technique once a short-term frequency estimate has been obtained. Herein, a novel iterative source separation procedure is proposed which builds upon these results. It consists in minimizing the mixing error by means of the auxiliary function…
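The two phase strategies contrasted in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, array shapes, and the power-ratio mask are assumptions made for illustration, and the iterative auxiliary-function procedure is not reproduced because this excerpt does not spell it out. The unwrapping recursion assumes the usual slowly varying sinusoid model, in which the phase in one frequency channel advances between consecutive frames by 2π times the hop size times the normalized frequency.

```python
import numpy as np

def wiener_like_estimate(mixture_stft, source_mags, k):
    """Baseline sketch: soft mask from magnitude estimates (assumed power-ratio form).

    mixture_stft: complex array, shape (F, T)
    source_mags:  non-negative array, shape (K, F, T), one magnitude estimate per source
    The mask is real and non-negative, so the result keeps the mixture phase.
    """
    power = source_mags ** 2
    mask = power[k] / (power.sum(axis=0) + 1e-12)
    return mask * mixture_stft

def unwrap_phase(initial_phase, norm_freq, hop):
    """Sinusoidal unwrapping sketch for a single frequency channel.

    norm_freq[t] is the estimated normalized frequency (cycles per sample)
    in frame t; hop is the STFT hop size in samples (both hypothetical inputs).
    Recursion: phase[t] = phase[t-1] + 2*pi*hop*norm_freq[t].
    """
    norm_freq = np.asarray(norm_freq, dtype=float)
    phase = np.empty(len(norm_freq))
    phase[0] = initial_phase
    for t in range(1, len(norm_freq)):
        phase[t] = phase[t - 1] + 2.0 * np.pi * hop * norm_freq[t]
    return phase
```

In this sketch, a source estimate with unwrapped phase would be formed as `magnitude * np.exp(1j * phase)`, whereas the baseline returns an estimate whose phase equals that of the mixture in every time-frequency bin.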
