Multichannel Audio Source Separation with Independent Deeply Learned Matrix Analysis Using Product of Source Models
Takuya Hasumi, Tomohiko Nakamura, Norihiro Takamune, Hiroshi Saruwatari, Daichi Kitamura, Yu Takahashi, Kazunobu Kondo

TL;DR
This paper introduces a multichannel audio source separation method that combines DNN-based and NMF-based source models using a product-of-experts approach, improving robustness to timbral mismatches between the target sounds and the DNN training data.
Contribution
It extends IDLMA by integrating NMF-based source models with deep learning models through the product-of-source-models framework, enhancing separation performance.
Findings
Effective in separating sources with mismatched timbres
Improved separation accuracy over traditional IDLMA
Computationally efficient parameter estimation algorithm
Abstract
Independent deeply learned matrix analysis (IDLMA) is one of the state-of-the-art multichannel audio source separation methods using the source power estimation based on deep neural networks (DNNs). The DNN-based power estimation works well for sounds having timbres similar to the DNN training data. However, the sounds to which IDLMA is applied do not always have such timbres, and the timbral mismatch causes the performance degradation of IDLMA. To tackle this problem, we focus on a blind source separation counterpart of IDLMA, independent low-rank matrix analysis. It uses nonnegative matrix factorization (NMF) as the source model, which can capture source spectral components that only appear in the target mixture, using the low-rank structure of the source spectrogram as a clue. We thus extend the DNN-based source model to encompass the NMF-based source model on the basis of the…
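The abstract is truncated above, so the exact formulation is not reproduced here. As a rough illustration only of how a product of two source models can behave, the following sketch assumes both models describe each time-frequency bin with a zero-mean Gaussian variance, in which case the renormalized product of the two densities is again Gaussian and its precision (inverse variance) is the sum of the two precisions. The function name, array sizes, and the random stand-in for the DNN output are hypothetical and not taken from the paper.

```python
import numpy as np


def product_of_gaussian_variances(var_a, var_b, eps=1e-12):
    """Variance of the renormalized product of two zero-mean Gaussian densities.

    Precisions add, so the result is 1 / (1/var_a + 1/var_b).
    `eps` guards against division by zero (an illustrative choice).
    """
    return 1.0 / (1.0 / (var_a + eps) + 1.0 / (var_b + eps))


rng = np.random.default_rng(0)
n_freq, n_frames, n_bases = 6, 8, 2  # toy sizes, chosen arbitrarily

# Hypothetical stand-in for a DNN-based power estimate (no real network here).
var_dnn = rng.uniform(0.1, 1.0, size=(n_freq, n_frames))

# NMF-style low-rank variance model: nonnegative bases times activations.
W = rng.uniform(0.1, 1.0, size=(n_freq, n_bases))
H = rng.uniform(0.1, 1.0, size=(n_bases, n_frames))
var_nmf = W @ H

# Combined variance under the product-of-Gaussians assumption.
var_combined = product_of_gaussian_variances(var_dnn, var_nmf)
```

Under this assumption the combined variance never exceeds the smaller of the two inputs, so each model can suppress energy that the other over-estimates. In a full separation method the NMF parameters would be estimated from the observed mixture rather than drawn at random as they are here.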
Taxonomy
Topics: Speech and Audio Processing · Blind Source Separation Techniques · Advanced Adaptive Filtering Techniques
