On Ambisonic Source Separation with Spatially Informed Non-negative Tensor Factorization

Mateusz Guzik; Konrad Kowalczyk

arXiv:2501.10305·eess.AS·January 20, 2025·IEEE ACM Trans. Audio Speech Lang. Process.

TL;DR

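This paper presents a non-negative tensor factorization method for separating sound sources in Ambisonic recordings. It incorporates prior knowledge of the sources' directions of arrival through a MAP constraint on the spatial covariance matrix, and it outperforms the maximum likelihood baseline and other existing techniques across varying source counts, reverberation levels, and prior accuracy.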

Contribution

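It derives four algorithms by pairing two cost functions (the squared Euclidean distance and the Itakura-Saito divergence) with two priors on the spatial covariance matrix (the Wishart and the Inverse Wishart). Spatial information enters the separation through a MAP framework, and the algorithms are validated in extensive experiments.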

Findings

01

Proposed MAP methods outperform baseline ML and other techniques in separation quality.

02

Algorithms perform well across different source counts, reverberation levels, and prior knowledge accuracy.

03

The improvements are reported on the standard objective separation metrics: SDR, ISR, SIR, and SAR.

Abstract

This article presents a Non-negative Tensor Factorization based method for sound source separation from Ambisonic microphone signals. The proposed method enables the use of prior knowledge about the Directions-of-Arrival (DOAs) of the sources, incorporated through a constraint on the Spatial Covariance Matrix (SCM) within a Maximum a Posteriori (MAP) framework. Specifically, this article presents a detailed derivation of four algorithms based on two cost functions, namely the squared Euclidean distance and the Itakura-Saito divergence, each combined with one of two prior probability distributions on the SCM: the Wishart and the Inverse Wishart. The experimental evaluation of the baseline Maximum Likelihood (ML) and the proposed MAP methods is primarily based on first-order Ambisonic recordings, using four different source signal datasets, three with musical…
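As a rough illustration of the ingredients named in the abstract, the sketch below computes the two cost functions on scalar power values and builds a rank-1 spatial covariance matrix from a DOA for first-order Ambisonics. It is not the authors' implementation: the function names are hypothetical, the ACN channel order with SN3D normalization is an assumption, the cost functions are shown in their single-channel form rather than the multichannel form a full SCM model would need, and treating the rank-1 matrix as the target of a Wishart or Inverse Wishart prior is only one plausible reading of the abstract.

```python
import numpy as np


def squared_euclidean(v, v_hat):
    """Squared Euclidean distance between observed and modeled power values."""
    v = np.asarray(v, dtype=float)
    v_hat = np.asarray(v_hat, dtype=float)
    return float(np.sum((v - v_hat) ** 2))


def itakura_saito(v, v_hat, eps=1e-12):
    """Itakura-Saito divergence, summed over all entries (scalar form).

    eps is a small constant (an assumption of this sketch) that guards
    against division by zero and log(0).
    """
    v = np.asarray(v, dtype=float)
    v_hat = np.asarray(v_hat, dtype=float)
    ratio = (v + eps) / (v_hat + eps)
    return float(np.sum(ratio - np.log(ratio) - 1.0))


def foa_steering(azimuth, elevation):
    """First-order Ambisonic encoding vector for a plane wave.

    Assumes ACN channel order (W, Y, Z, X), SN3D normalization,
    and angles in radians.
    """
    return np.array([
        1.0,                                  # W
        np.sin(azimuth) * np.cos(elevation),  # Y
        np.sin(elevation),                    # Z
        np.cos(azimuth) * np.cos(elevation),  # X
    ])


def doa_scm(azimuth, elevation):
    """Rank-1 spatial covariance matrix implied by a single DOA."""
    a = foa_steering(azimuth, elevation)
    return np.outer(a, a)


if __name__ == "__main__":
    v = np.array([1.0, 2.0, 4.0])
    v_hat = np.array([1.0, 1.0, 2.0])
    print("squared Euclidean:", squared_euclidean(v, v_hat))
    print("Itakura-Saito:", itakura_saito(v, v_hat))
    print("SCM for a frontal source:\n", doa_scm(0.0, 0.0))
```

One practical difference between the two costs is visible here: the Itakura-Saito divergence is unchanged when both arguments are scaled by the same factor, whereas the squared Euclidean distance grows with the square of that factor, so it weights high-energy time-frequency bins more heavily.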

Code & Models

Repositories

metlosz/ambisonic_spatially_informed_ntf
none · Official
