TL;DR
This paper presents a phase-aware probabilistic model for monaural audio source separation. It replaces the usual uniform-phase assumption with a Bayesian anisotropic Gaussian source model whose variances are structured by NMF, and it reports better separation than existing phase-aware methods.
Contribution
It introduces a novel complex ISNMF model with phase-aware anisotropic Gaussian sources and a Markov chain prior, enhancing audio separation performance.
Findings
Outperforms state-of-the-art phase-aware separation techniques
Effectively models phase structure using Bayesian anisotropic Gaussian sources
Improves energy preservation in source separation
Abstract
This paper introduces a phase-aware probabilistic model for audio source separation. Classical source models in the short-time Fourier transform domain use circularly-symmetric Gaussian or Poisson random variables. This is equivalent to assuming that the phase of each source is uniformly distributed, which is not suitable for exploiting the underlying structure of the phase. Drawing on preliminary works, we introduce here a Bayesian anisotropic Gaussian source model in which the phase is no longer uniform. Such a model permits us to favor a phase value that originates from a signal model through a Markov chain prior structure. The variances of the latent variables are structured with nonnegative matrix factorization (NMF). The resulting model is called complex Itakura-Saito NMF (ISNMF) since it generalizes the ISNMF model to the case of non-isotropic variables. It combines the advantages…
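The contrast the abstract draws between circularly-symmetric and anisotropic Gaussian sources can be illustrated with a minimal NumPy sketch. It is not the paper's model: the anisotropy parameter `kappa`, the random factors `W` and `H`, the random stand-in phase `mu`, and both function names are assumptions made for illustration, and the Markov chain prior is omitted. Each time-frequency bin gets a variance from an NMF product, and a nonzero relation term E[x^2] concentrates the phase around `mu` (modulo pi, since the samples are zero-mean).

```python
import numpy as np


def sample_sources(V, kappa, mu, rng):
    """Draw one zero-mean complex Gaussian sample per bin.

    Each sample has E|x|^2 = V and E[x^2] = kappa * V * exp(2j * mu).
    kappa = 0 gives the circularly-symmetric (isotropic) case;
    0 < kappa < 1 stretches the distribution along the direction mu.
    """
    n1 = rng.standard_normal(V.shape)
    n2 = rng.standard_normal(V.shape)
    major = np.sqrt(0.5 * V * (1.0 + kappa)) * n1  # component along mu
    minor = np.sqrt(0.5 * V * (1.0 - kappa)) * n2  # component orthogonal to mu
    return np.exp(1j * mu) * (major + 1j * minor)


def doubled_phase_concentration(x, mu):
    """Resultant length of the doubled phase error, between 0 and 1.

    The angle is doubled because x and -x are equally likely for a
    zero-mean Gaussian, so the phase is only determined modulo pi.
    Values near 0 indicate a uniform phase.
    """
    return np.abs(np.mean(np.exp(2j * (np.angle(x) - mu))))


rng = np.random.default_rng(0)
F, T, K = 64, 200, 4               # frequency bins, frames, NMF rank (arbitrary)
W = rng.random((F, K))             # hypothetical spectral templates
H = rng.random((K, T))             # hypothetical activations
V = W @ H                          # NMF-structured variances, one per bin
mu = rng.uniform(-np.pi, np.pi, size=(F, T))  # stand-in for a signal-model phase

x_iso = sample_sources(V, kappa=0.0, mu=mu, rng=rng)
x_aniso = sample_sources(V, kappa=0.8, mu=mu, rng=rng)

r_iso = doubled_phase_concentration(x_iso, mu)
r_aniso = doubled_phase_concentration(x_aniso, mu)
power_ratio = np.mean(np.abs(x_aniso) ** 2 / V)

print(f"isotropic concentration:   {r_iso:.3f}")
print(f"anisotropic concentration: {r_aniso:.3f}")
print(f"mean |x|^2 / V (aniso):    {power_ratio:.3f}")
```

In this sketch the isotropic samples carry no information about `mu`, while the anisotropic samples keep the same expected power per bin but cluster around the chosen phase. That is the property a phase-aware separator can exploit.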
