Separation of Moving Sound Sources Using Multichannel NMF and Acoustic Tracking
Joonas Nikunen, Aleksandr Diment, Tuomas Virtanen

TL;DR
This paper introduces a multichannel NMF-based method that separates moving sound sources by first tracking them and then estimating their spectrograms. It is evaluated on real recordings of two and three simultaneous moving sources.
Contribution
The paper presents a new multichannel NMF model with time-varying spatial covariance matrices for separating moving sound sources, incorporating source tracking and spectral estimation.
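As a rough illustration of what a time-varying SCM looks like, the sketch below builds one rank-1 SCM per time frame from a tracked azimuth. It assumes a far-field plane-wave model, a planar microphone array, and a speed of sound of 343 m/s. These assumptions and every function name are hypothetical and chosen for illustration; this is not the paper's own SCM parameterization.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for illustration


def steering_vector(mic_xy, azimuth_rad, freq_hz):
    """Far-field steering vector for a planar array (assumed plane-wave model)."""
    ux, uy = math.cos(azimuth_rad), math.sin(azimuth_rad)
    vec = []
    for x, y in mic_xy:
        # Arrival time relative to the array origin: microphones closer to
        # the source direction receive the wavefront earlier.
        delay = -(x * ux + y * uy) / SPEED_OF_SOUND
        vec.append(cmath.exp(-2j * math.pi * freq_hz * delay))
    return vec


def scm_from_doa(mic_xy, azimuth_rad, freq_hz):
    """Rank-1 spatial covariance matrix a a^H for one direction and frequency."""
    a = steering_vector(mic_xy, azimuth_rad, freq_hz)
    return [[ai * aj.conjugate() for aj in a] for ai in a]


def time_varying_scms(mic_xy, azimuth_track, freq_hz):
    """One SCM per time frame, following a tracked azimuth trajectory."""
    return [scm_from_doa(mic_xy, az, freq_hz) for az in azimuth_track]
```

For example, `time_varying_scms([(-0.05, 0.0), (0.05, 0.0)], [0.0, math.pi / 4], 1000.0)` returns two 2x2 Hermitian matrices whose off-diagonal phase changes as the source moves.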
Findings
Outperforms conventional beamforming in separation quality; ideal ratio mask separation, which relies on the true source signals, is included as a reference.
Evaluated on real recordings of two and three simultaneous moving sound sources using established objective separation criteria.
Sensitivity to tracking errors is assessed by comparing against separation that uses ground truth source trajectories.
Abstract
In this paper we propose a method for separation of moving sound sources. The method is based on first tracking the sources and then estimation of source spectrograms using multichannel non-negative matrix factorization (NMF) and extracting the sources from the mixture by single-channel Wiener filtering. We propose a novel multichannel NMF model with time-varying mixing of the sources denoted by spatial covariance matrices (SCM) and provide update equations for optimizing model parameters minimizing squared Frobenius norm. The SCMs of the model are obtained based on estimated directions of arrival of tracked sources at each time frame. The evaluation is based on established objective separation criteria and using real recordings of two and three simultaneous moving sound sources. The compared methods include conventional beamforming and ideal ratio mask separation. The proposed method…
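The final extraction step mentioned in the abstract, single-channel Wiener filtering, can be sketched as follows. The sketch assumes that non-negative spectrogram estimates for each source are already available (for instance from an NMF model) and that the mask is the ratio of each source estimate to the sum over all sources. Whether the estimates are magnitudes or powers, and all names used here, are assumptions made for illustration rather than details taken from the paper.

```python
def wiener_separate(mixture, source_specs, eps=1e-12):
    """Apply ratio masks built from source spectrogram estimates to a mixture STFT.

    mixture: list of rows (frequency) of complex STFT values over time frames.
    source_specs: one non-negative spectrogram per source, same shape as mixture.
    eps: small constant that avoids division by zero in silent bins.
    """
    n_freq, n_frames = len(mixture), len(mixture[0])
    separated = []
    for spec in source_specs:
        out = []
        for f in range(n_freq):
            row = []
            for t in range(n_frames):
                # Sum of all source estimates in this time-frequency bin.
                total = sum(s[f][t] for s in source_specs) + eps
                # The soft mask scales the mixture bin and keeps its phase.
                row.append(spec[f][t] / total * mixture[f][t])
            out.append(row)
        separated.append(out)
    return separated
```

Because the masks sum to one in every bin (up to `eps`), the separated source STFTs add back up to the mixture.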
