Empirical Bayesian Independent Deeply Learned Matrix Analysis For Multichannel Audio Source Separation

Takuya Hasumi; Tomohiko Nakamura; Norihiro Takamune; Hiroshi Saruwatari; Daichi Kitamura; Yu Takahashi; Kazunobu Kondo

arXiv:2106.03492·cs.SD·June 8, 2021

Open Access

TL;DR

This paper introduces EB-IDLMA, an extension of IDLMA that incorporates a prior distribution for source power spectrograms, improving multichannel audio source separation by accounting for the reliability of spectrogram estimates.

Contribution

It proposes a novel empirical Bayesian framework for IDLMA, modeling source power spectrograms as latent variables to enhance separation performance.

Findings

01. EB-IDLMA outperforms traditional IDLMA in experiments.

02. Incorporating prior distributions improves source separation accuracy.

03. Reliability modeling of spectrogram estimates is crucial for performance.

Abstract

Independent deeply learned matrix analysis (IDLMA) is one of the state-of-the-art supervised multichannel audio source separation methods. It blindly estimates the demixing filters on the basis of source independence, using the source model estimated by the deep neural network (DNN). However, since the ratios of the source to interferer signals vary widely among time-frequency (TF) slots, it is difficult to obtain reliable estimated power spectrograms of sources at all TF slots. In this paper, we propose an IDLMA extension, empirical Bayesian IDLMA (EB-IDLMA), by introducing a prior distribution of source power spectrograms and treating the source power spectrograms as latent random variables. This treatment allows us to implicitly consider the reliability of the estimated source power spectrograms for the estimation of demixing filters through the hyperparameters of the prior…
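To make concrete why unreliable power spectrogram estimates matter for demixing-filter estimation, here is a minimal sketch of one iterative-projection (IP) sweep, a demixing-filter update commonly used in this family of independence-based methods. It is an illustration under stated assumptions, not the paper's implementation: the function name `ip_update`, the array layout, and the use of IP itself are assumptions, and the variances `r` stand in for whatever source model (for example, DNN outputs) is available. The sketch covers only the baseline role of the variance estimates and does not model the prior distribution that EB-IDLMA introduces.

```python
import numpy as np

def ip_update(W, X, r, eps=1e-10):
    """One iterative-projection sweep for a single frequency bin (hypothetical helper).

    W : (N, N) complex demixing matrix; row n holds w_n^H.
    X : (T, N) complex mixture STFT frames at this frequency bin.
    r : (N, T) positive source variance estimates (assumed given, e.g. by a DNN).
    """
    N = W.shape[0]
    T = X.shape[0]
    W = W.astype(complex)  # astype returns a copy, so the caller's W is untouched
    for n in range(N):
        # Weighted spatial covariance: U_n = (1/T) * sum_t x_t x_t^H / r[n, t].
        # A badly estimated r[n, t] directly skews this average.
        weights = 1.0 / np.maximum(r[n], eps)
        U = (X.T * weights) @ X.conj() / T
        # Solve (W U_n) w_n = e_n, then normalize so that w_n^H U_n w_n = 1.
        w = np.linalg.solve(W @ U, np.eye(N)[:, n])
        w = w / np.sqrt(np.real(w.conj() @ U @ w))
        W[n] = w.conj()
    return W

# Toy usage with random data (not real audio).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 2)) + 1j * rng.standard_normal((200, 2))
r_demo = rng.uniform(0.5, 2.0, size=(2, 200))
W_demo = ip_update(np.eye(2, dtype=complex), X_demo, r_demo)
```

Because each `U` is a frame average weighted by `1 / r[n, t]`, every TF slot contributes to the filter update according to its estimated variance, which is why the reliability of those estimates affects the resulting demixing filters.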

Taxonomy

Topics: Speech and Audio Processing · Blind Source Separation Techniques · Advanced Adaptive Filtering Techniques