Switching Independent Vector Analysis and Its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms
Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo, Shoko Araki

TL;DR
This paper introduces switching IVA (swIVA) and its extension swCIVA, which improve blind source separation and dereverberation with a relatively small number of microphones by clustering the time frames of the observed signal into groups and applying IVA to each group individually, outperforming conventional IVA.
Contribution
The paper proposes a novel switching mechanism for IVA and extends it to convolutional beamforming, enabling effective source separation and dereverberation with limited microphones.
Findings
swIVA outperforms conventional IVA in low-microphone scenarios.
swCIVA effectively combines dereverberation and source separation.
Both methods improve speech quality and recognition scores.
Abstract
This paper develops a framework that can perform denoising, dereverberation, and source separation accurately by using a relatively small number of microphones. It has been empirically confirmed that Independent Vector Analysis (IVA) can blindly separate N sources from their sound mixture even with diffuse noise when a sufficiently large number (=M) of microphones are available (i.e., M>>N). However, the estimation accuracy seriously degrades as the number of microphones, or more specifically M-N (>=0), decreases. To overcome this limitation of IVA, we propose switching IVA (swIVA) in this paper. With swIVA, time frames of an observed signal with time-varying characteristics are clustered into several groups, each of which can be well handled by IVA using a small number of microphones, and thus accurate estimation can be achieved by applying IVA individually to each of the groups.…
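The per-group processing described in the abstract can be illustrated with a minimal sketch. It assumes the frame-to-group labels and one demixing matrix per group and frequency bin have already been estimated by some means; the function name `apply_switching_demixing`, the array layouts, and the hard (one group per frame) assignment are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def apply_switching_demixing(X, W, labels):
    """Apply a group-specific demixing matrix to each time frame.

    X      : (F, T, M) complex STFT of the M-microphone mixture (assumed layout)
    W      : (K, F, N, M) demixing matrices, one per group k and frequency f
             (assumed to be estimated beforehand, e.g. by running IVA per group)
    labels : (T,) integer group index in [0, K) for every time frame
    Returns: (F, T, N) separated signals
    """
    F, T, M = X.shape
    K, _, N, _ = W.shape
    Y = np.zeros((F, T, N), dtype=complex)
    for k in range(K):
        idx = np.flatnonzero(labels == k)  # frames assigned to group k
        if idx.size == 0:
            continue
        # Y[f, t, n] = sum_m W[k, f, n, m] * X[f, t, m] for frames in group k
        Y[:, idx, :] = np.einsum("fnm,ftm->ftn", W[k], X[:, idx, :])
    return Y
```

With a single group (K = 1) this reduces to ordinary frame-independent demixing; the switching only changes which matrix is applied to which frame.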
Taxonomy
Topics: Blind Source Separation Techniques · Speech and Audio Processing · Advanced Adaptive Filtering Techniques
