Consistent ICA: Determined BSS meets spectrogram consistency
Kohei Yatabe

TL;DR
This paper introduces a novel approach to multichannel audio blind source separation that leverages spectrogram consistency to address the permutation problem inherent in frequency-wise filtering.
Contribution
It demonstrates that spectrogram consistency can be used to improve source alignment in determined BSS, offering a new perspective on solving the permutation problem.
Findings
Spectrogram consistency aids in solving the permutation problem.
The proposed method improves source separation accuracy.
Spectrogram consistency provides a new tool for BSS algorithms.
Abstract
Multichannel audio blind source separation (BSS) in the determined situation (the number of microphones is equal to the number of sources), or determined BSS, is performed by multichannel linear filtering in the time-frequency domain to handle the convolutive mixing process. Ordinarily, the filter treats each frequency independently, which causes the well-known permutation problem, i.e., the problem of how to align the frequency-wise filters so that each separated component is assigned to the correct source. This paper shows that a general property of the time-frequency-domain representation, called spectrogram consistency, can assist in solving the permutation problem.
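The link between consistency and the permutation problem can be illustrated with a small numerical sketch. This is not the paper's algorithm, only a toy demonstration under stated assumptions: two white-noise signals stand in for the sources, a simple NumPy STFT with a sine window and 75% overlap is used, and a permutation error is simulated by swapping every odd frequency bin between the two source spectrograms. The helper names (`stft`, `istft`, `inconsistency`) and all parameter values are hypothetical choices for this example. A spectrogram is called consistent when it is the STFT of some time-domain signal, so the distance between a spectrogram and the STFT of its inverse STFT measures how far it is from being consistent.

```python
import numpy as np

def stft(x, win, hop):
    """Return a (frames, bins) complex spectrogram of the real signal x."""
    n = len(win)
    starts = range(0, len(x) - n + 1, hop)
    return np.array([np.fft.rfft(win * x[s:s + n]) for s in starts])

def istft(X, win, hop, length):
    """Windowed overlap-add inverse, normalized by the summed squared window."""
    n = len(win)
    x = np.zeros(length)
    norm = np.zeros(length)
    for t, frame in enumerate(X):
        s = t * hop
        x[s:s + n] += win * np.fft.irfft(frame, n)
        norm[s:s + n] += win ** 2
    return x / np.maximum(norm, 1e-12)

def inconsistency(X, win, hop, length):
    """Squared distance between X and STFT(iSTFT(X)).

    Numerically zero when X is the STFT of a real signal of this length.
    """
    return np.sum(np.abs(X - stft(istft(X, win, hop, length), win, hop)) ** 2)

# Assumed toy setup: window length 64, hop 16, 29 frames covering 512 samples.
rng = np.random.default_rng(0)
n, hop, frames = 64, 16, 29
length = n + hop * (frames - 1)
win = np.sin(np.pi * (np.arange(n) + 0.5) / n)  # sine window, strictly positive

# Two independent white-noise "sources" and their (consistent) spectrograms.
s1, s2 = rng.standard_normal((2, length))
S1, S2 = stft(s1, win, hop), stft(s2, win, hop)

# Simulated permutation error: odd-numbered bins are swapped between sources.
swap = np.arange(S1.shape[1]) % 2 == 1
P1 = np.where(swap, S2, S1)
P2 = np.where(swap, S1, S2)

aligned = inconsistency(S1, win, hop, length) + inconsistency(S2, win, hop, length)
permuted = inconsistency(P1, win, hop, length) + inconsistency(P2, win, hop, length)
print(f"aligned:  {aligned:.3e}")
print(f"permuted: {permuted:.3e}")
```

In this sketch the correctly aligned spectrograms have an inconsistency at the level of floating-point error, while the bin-swapped ones do not. The reason is that overlapping windows make neighboring time-frequency bins of a true STFT interdependent, and mixing bins from different sources breaks that dependence. This is the sense in which consistency carries information about frequency alignment that a purely frequency-wise filter ignores.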
