Phase recovery in NMF for audio source separation: an insightful benchmark
Paul Magron, Roland Badeau, Bertrand David

TL;DR
This paper evaluates phase recovery methods in audio source separation based on Nonnegative Matrix Factorization (NMF), highlighting the ability of High Resolution NMF (HRNMF) to model phase information and correlations over time.
Contribution
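It presents a methodology for benchmarking NMF-based source separation techniques that involve phase reconstruction. For each model, blind separation (no prior information) is compared with oracle separation (supervised model learning), and the potential of High Resolution NMF (HRNMF) is emphasized.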
Findings
HRNMF effectively captures phase and temporal correlations.
Supervised models outperform blind separation in phase recovery.
HRNMF shows promising results for audio source separation.
Abstract
Nonnegative Matrix Factorization (NMF) is a powerful tool for decomposing mixtures of audio signals in the Time-Frequency (TF) domain. In applications such as source separation, the phase recovery for each extracted component is a major issue since it often leads to audible artifacts. In this paper, we present a methodology for evaluating various NMF-based source separation techniques involving phase reconstruction. For each model considered, a comparison between two approaches (blind separation without prior information and oracle separation with supervised model learning) is performed, in order to gauge how much room for improvement remains for the estimation methods. Experimental results show that the High Resolution NMF (HRNMF) model is particularly promising, because it is able to take phases and correlations over time into account with a great expressive power.
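To make the phase recovery problem concrete, here is a minimal sketch of the simplest baseline such a benchmark might include: plain NMF on a magnitude spectrogram, followed by soft masking that assigns the mixture phase to every component. This is not HRNMF and not the authors' implementation. The function names (nmf_kl, soft_mask_components), the choice of Kullback-Leibler multiplicative updates, and the synthetic "spectrogram" are assumptions made for illustration only.

```python
import numpy as np


def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix V (F x T) as W @ H using the standard
    multiplicative updates for the Kullback-Leibler divergence."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


def soft_mask_components(X, W, H, eps=1e-12):
    """Split the complex TF matrix X into one complex component per NMF
    atom. Each mask is real and nonnegative, so every component inherits
    the phase of the mixture X (the baseline that phase-aware models
    try to improve on)."""
    WH = W @ H + eps
    # masks has shape (rank, F, T); masks[k] = outer(W[:, k], H[k]) / WH
    masks = (W.T[:, :, None] * H[:, None, :]) / WH[None, :, :]
    return masks * X[None, :, :]


# Hypothetical stand-in for an STFT: rank-2 magnitudes with random phases.
rng = np.random.default_rng(1)
F, T, K = 16, 40, 2
V_true = rng.random((F, K)) @ rng.random((K, T))
X = V_true * np.exp(1j * rng.uniform(-np.pi, np.pi, size=(F, T)))

V = np.abs(X)                      # magnitude spectrogram fed to NMF
W, H = nmf_kl(V, rank=K)
comps = soft_mask_components(X, W, H)
```

Because the masks sum to one in every TF bin, the components add back up to the mixture, but each one simply copies the mixture phase. When sources overlap in the TF domain this shared phase is generally wrong for the individual components, which is the kind of artifact the abstract refers to.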
Taxonomy
Topics: Speech and Audio Processing · Blind Source Separation Techniques · Music and Audio Processing
