Audio Source Separation in Reverberant Environments using $\beta$-divergence based Nonnegative Factorization
Mahmoud Fakhry, Piergiorgio Svaizer, and Maurizio Omologo

TL;DR
This paper introduces a novel nonnegative factorization approach using $\beta$-divergence for multichannel audio source separation in reverberant environments, improving separation quality over existing methods.
Contribution
It proposes a new parameter estimation method based on nonnegative tensor factorization with $\beta$-divergence, leveraging prior spectral information for enhanced separation.
Findings
Sparsity in factorization improves separation performance.
The method outperforms comparable algorithms in various mixing conditions.
Tuning $\beta$ controls sparsity, impacting separation quality.
Abstract
In Gaussian model-based multichannel audio source separation, the likelihood of observed mixtures of source signals is parametrized by source spectral variances and by associated spatial covariance matrices. These parameters are estimated by maximizing the likelihood through an Expectation-Maximization algorithm and used to separate the signals by means of multichannel Wiener filtering. We propose to estimate these parameters by applying nonnegative factorization based on prior information on source variances. In the nonnegative factorization, spectral basis matrices can be defined as the prior information. The matrices can be either extracted or indirectly made available through a redundant library that is trained in advance. In a separate step, applying nonnegative tensor factorization, two algorithms are proposed in order to either extract or detect the basis matrices that best…
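To make the idea of factorizing against prior spectral bases concrete, here is a minimal sketch of the standard multiplicative-update rule for $\beta$-divergence nonnegative matrix factorization with a fixed basis matrix. It is not the authors' algorithm: the paper works with tensor factorization over multichannel data, while this sketch is single-channel, and the names `beta_divergence`, `estimate_activations`, `V`, `W`, and `H` are chosen here for illustration only. It assumes a strictly positive power spectrogram `V` and a pretrained basis `W`.

```python
import numpy as np


def beta_divergence(V, V_hat, beta):
    """Sum of element-wise beta-divergences d_beta(V | V_hat).

    Assumes V and V_hat are strictly positive arrays of the same shape.
    """
    if beta == 0:  # Itakura-Saito divergence
        ratio = V / V_hat
        return float(np.sum(ratio - np.log(ratio) - 1.0))
    if beta == 1:  # generalized Kullback-Leibler divergence
        return float(np.sum(V * np.log(V / V_hat) - V + V_hat))
    # General case; beta == 2 gives half the squared Euclidean distance.
    num = V**beta + (beta - 1.0) * V_hat**beta - beta * V * V_hat ** (beta - 1.0)
    return float(np.sum(num) / (beta * (beta - 1.0)))


def estimate_activations(V, W, beta=1.0, n_iter=200, eps=1e-12, seed=0):
    """Estimate nonnegative activations H such that V is approximately W @ H.

    W (frequency x components) is held fixed, playing the role of prior
    spectral information; only H (components x frames) is updated.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        V_hat = W @ H + eps
        numer = W.T @ (V * V_hat ** (beta - 2.0))
        denom = W.T @ V_hat ** (beta - 1.0) + eps
        H *= numer / denom  # multiplicative update keeps H nonnegative
    return H


# Hypothetical toy data: 6 frequency bins, 3 basis spectra, 20 frames.
rng = np.random.default_rng(42)
W = rng.random((6, 3)) + 0.1
H_true = rng.random((3, 20)) + 0.1
V = W @ H_true

H_est = estimate_activations(V, W, beta=1.0, n_iter=200)
fit = beta_divergence(V, W @ H_est, beta=1.0)
```

With `beta` between 1 and 2 this update is known not to increase the divergence; outside that range it is commonly used as a heuristic. Changing `beta` changes how reconstruction errors are weighted across high- and low-energy bins, which is one way to explore the trade-off the summary above attributes to tuning $\beta$.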
